Robert Boyle (1627-1691)

Robert Boyle was one of the most prolific figures in the scientific revolution and the leading scientist of his day. He was a proponent of the mechanical philosophy, which sought to explain natural phenomena in terms of matter and motion rather than by appealing to Aristotelian substantial forms and qualities. He was a champion of experimental science, claiming that theory should conform to observation and advocating openness in the publication of experimental results, the replication of experiments for empirical corroboration, and the importance of recording even those experiments that failed, at a time when these ideas were revolutionary. He defended and developed the distinction between primary and secondary qualities and supported it with detailed experimental evidence. With the help of his colleague Robert Hooke (1635-1703), he designed and improved an air pump capable of creating and sustaining a vacuum and used it to perform many famous experiments, investigating things like respiration, disease, combustion, sound, and air pressure. He discovered Boyle’s law, which states that, at constant temperature, the pressure and volume of a gas are inversely proportional. He used empirical evidence to refute both the four-element theory of Aristotle and the more recent three-principle theory of Paracelsus (1493-1541). Finally, many historians of science consider him to be the father of modern chemistry.

This article focuses on the philosophical significance of Boyle’s work, but it is important to note that Boyle was a polymath with diverse interests ranging from animal husbandry to underwater respiration, from the study of ancient languages to finding ways of extending the human lifespan. Furthermore, Boyle had both the intellect and the financial resources to pursue such a wide research agenda. Focusing on his philosophy, or even his chemistry, runs the risk of ignoring the true complexity of his thought. Nevertheless, much of Boyle’s work has had enduring philosophical significance.

Table of Contents

  1. Life
  2. Natural Philosophy
    a. Rejection of Aristotelianism
    b. The Mechanical Philosophy
    c. Chemistry
    d. Alchemy
    e. Medicine
    f. Pneumatics
  3. Philosophy of Science
  4. Substance Dualism
  5. Causation
  6. God
  7. Ethics
  8. Casuistry
  9. References and Further Reading
    1. Recent Editions of Boyle’s Works
    2. Chronological List of Boyle’s Publications
    3. Correspondence
    4. Work Diaries
    5. Biographies
    6. Selected Works on Boyle
    7. Other Important Works

1. Life

Robert Boyle was born on the 25th of January, 1627, at Lismore Castle, County Waterford, Ireland. He was the fourteenth child of Richard Boyle, the first Earl of Cork, who had come to Ireland from Canterbury, essentially penniless, in 1588. By the time of Boyle’s birth, through a series of shrewd and sometimes shady real estate ventures, Cork had become the wealthiest man in Ireland. This incredible wealth can be seen in Boyle’s lavish upbringing and education. After the death of his mother in 1630, Boyle’s daily care and supervision went to a local Irish woman, known today only as Nurse Allen. Allen raised Boyle, teaching him the Irish language, until his eighth year when he was sent away, along with his brother Francis, for a formal education at Eton.

After Boyle had spent only three years at Eton, Cork decided to send him, along with his brother Francis, on a grand tour of the continent under the tutelage of Isaac Marcombes. Marcombes was a renowned teacher from Switzerland and had just returned from a similar tour in which he had tutored Boyle’s older brothers. Boyle spent most of the tour in Geneva, at Marcombes’s home, where he studied a variety of subjects, including French, Latin, Italian, geometry, Roman history, philosophy, tennis, fencing, and horseback riding.

During his initial stay in Geneva in 1641, Boyle had a life-changing experience. One night during a terrible storm, he thought the Day of Judgment had come and that he had wasted his life on trivial pursuits. Boyle made an oath that he would dedicate himself to the Christian service of humanity if he was allowed to survive. The next morning, after the storm had passed, the young Boyle swore the oath again to demonstrate his sincerity. For the rest of his life he dedicated himself to various charitable endeavors. Even much of his later scientific work was directly motivated by what Boyle perceived as his religious duty. This event also led Boyle to a renewed dedication to his studies, as well as a lifelong aversion to swearing oaths. Later in life, for example, he declined the presidency of the Royal Society because it required swearing an oath. He even wrote a treatise, A Free Discourse Against Customary Swearing (1695).

During the grand tour, Boyle also travelled in France and Italy. The party tried to visit Galileo, and Boyle studied Italian to read Galileo’s works in preparation, but the great scientist died before Boyle could meet him. The grand tour came to an end when Boyle received the news that his father had died. After sufficient finances were secured, Boyle returned to England and eventually settled at the family estate at Stalbridge, where he devoted himself to writing chivalric romances, a common literary form at the time, and moral treatises.

It is hard to determine when Boyle developed a serious interest in natural philosophy, but a few events are noteworthy. The Boyle scholar Michael Hunter puts it in the early 1650s, warning against an interpretation that makes it seem inevitable that Boyle would become a scientist. However, we should not ignore events in Boyle’s life that indicate an early interest in natural philosophy, and the more one looks into the matter, the more a steady interest in natural philosophy becomes apparent. While it was not inevitable that Boyle would become a scientist, neither is it surprising.

Boyle had been familiar with the work of Aristotle, Bacon, and Galileo since his days at Eton. As early as 1646, events in Boyle’s life show an increasing interest in chemistry. An important letter to his sister Katherine Ranelagh (1615-1691) from May of that year shows that Boyle made a serious attempt to design and construct a chemical laboratory at Stalbridge. The attempt was unsuccessful, since an essential furnace was delivered “crumbled into as many pieces as we are into sects!” But the attempt itself is sufficient evidence of a serious interest in natural philosophy. Nevertheless, ethics was still Boyle’s primary philosophical concern during this period.

A trip to Leiden to attend his brother’s wedding in 1648 is also pertinent because at that time there was a thriving intellectual community of natural philosophers, with multiple schools of anatomy and the controversial mechanical philosophy of René Descartes (1596-1650) being discussed all over Holland. During this trip Boyle visited the University of Leiden and viewed an experiment on the nature of light in which the image of the city was projected onto the wall in the room of a high tower. This event may be the cause of the once-common view that Boyle studied there. However, these early experiences pale in importance next to a conversion experience Boyle had in the early 1650s, when he essentially became a scientist. Boyle had found a way to combine his interests in natural philosophy with his pledge to dedicate his life to philanthropic pursuits.

He was encouraged in these endeavors by his older sister and best friend, the Lady Katherine Ranelagh (1615-1691). His relationship with Ranelagh would be the closest one of his life. Ranelagh was an important natural philosopher in her own right, respected and consulted by her contemporaries, who found a way to pursue her scientific research within the confines of the strict gender norms of seventeenth-century England. Later in life, at her London estate on Pall Mall, she would become Boyle’s intellectual companion, editor, and most trusted collaborator. In the early 1650s, her main contribution to Boyle’s philosophical development was her participation in the Hartlib Circle.

Samuel Hartlib (1600-1662) was a German polymath who moved to England in 1628 and recruited intellectuals and experts in all sorts of fields for a variety of religious philanthropic endeavors, including projects in medicine, public education, agriculture, animal husbandry, and translations of the Bible into other languages (Boyle would eventually help with projects to translate the Bible into Irish, Malay, and Algonquin). The members of the circle included Heinrich Appelius, Friedrich Clodius, Cheney Culpeper, John Dury, Theodore Haak, Godofred Hotton, Joachim Hübner, Katherine Ranelagh, Johann Moriaen, John Pell, William Petty, Johann Rulicius, John Sadler, George Starkey, and Benjamin Worsley. Hartlib, like Marin Mersenne (1588-1648), had a vast network of correspondence, with so many individuals that it is hard to establish a comprehensive list. However, it is important to note that by the time Boyle began participating in the circle, Ranelagh was already an established member. Furthermore, Ranelagh was a very important member: of the 766 names mentioned in Hartlib’s correspondence, hers is the sixth most mentioned. The group’s activities were inspired by the utopian writings of Francis Bacon (1561-1626), and it is Bacon who would have the single greatest influence on Boyle’s philosophy. The Hartlib Circle became a prototype of the modern scientific research society. It was eventually replaced by formal scientific societies, such as the Royal Society, of which Boyle was a founding member.

In 1652, Boyle briefly returned to Ireland to settle matters involving his inheritance. Although he was in Ireland only a short time, Hartlib recruited Boyle to work on a number of projects. Boyle was asked to create a Baconian natural history of Ireland and to research ways of developing new agricultural and animal husbandry techniques there, but these projects never got off the ground. By now, Boyle’s primary interest was to learn the empirically oriented chemistry of Jean Baptiste Van Helmont (1580-1644). He was being helped in this endeavor through correspondence with the American alchemist George Starkey (1628-1665). However, unable to establish a chemical laboratory in Ireland, Boyle spent his time reading up to 12 hours a day and learning anatomy from William Petty (1623-1687), who had learned anatomy in Leiden, taught it at Oxford, and then followed Cromwell to Ireland as Physician General.

Boyle’s serious investigations into natural philosophy really began when he became Starkey’s pupil. In Alchemy Tried in the Fire: Starkey, Boyle, and the Fate of Helmontian Chymistry (2002), William Newman and Lawrence Principe present a detailed analysis of Starkey’s influence on Boyle’s chemical education. They suggest using the term Chymistry to refer to the general group of issues concerning alchemy and chemistry in the early modern period, noting that the terms were then often used synonymously, while they have very different connotations in contemporary discourse.

Starkey was greatly influenced by Van Helmont, and Boyle eventually replicated many of Van Helmont’s experiments. By the time Boyle returned to England he was thoroughly absorbed in natural philosophy, wasting little time in moving to Oxford, networking with other scientists, and establishing the laboratory for which he is now famous. From this point, and for the rest of his life, Boyle was constantly conducting experiments. His published works, correspondence, and work notes, many of which survive, became full of detailed accounts of them. Boyle spent this important part of his career in one of the most thriving intellectual environments in the world at the time, working on a variety of projects involving chemical analysis as well as experiments in medicine, pneumatics, and hydraulics. He became involved with a group of like-minded, anti-Aristotelian natural philosophers, which included John Locke (1632-1704) and eventually Isaac Newton (1643-1727), who regarded Boyle’s work on pneumatics as a paradigm of science.

The natural philosopher Robert Hooke (1635-1703) began his career as Boyle’s laboratory assistant. Together, they made improvements on the air-pump design made by Otto von Guericke (1602-1686), and produced a machine capable of evacuating most of the air from an observable glass chamber. They did a large number of experiments with it, and by presenting these to noble and socially influential audiences, they produced useful publicity for the scientific activities of the Royal Society. Inspired by Bacon’s conception of science, Boyle developed and used new technological instruments that enabled detailed, replicable observations which he thought revealed the hidden structure of the natural world.

Boyle also became close friends with the young John Locke, who went to Oxford in 1652 to study medicine. They even worked on a few medical projects together. Boyle had a significant influence on Locke’s philosophical development, including his distinction between primary and secondary qualities, and the difference between real and nominal essences.

Some of Boyle’s scientific claims were criticized by Thomas Hobbes (1588-1679), and the two philosophers became involved in a heated public debate over the role of experimental observation in natural philosophy. Poor health caused Boyle to move to London in 1668. There he lived with his sister Katherine for the rest of his life. For over twenty years they worked together on various projects in medicine, natural philosophy, and philanthropy. They received many important visitors who would come to witness his famous experiments.

Boyle died of grief on December 31, 1691, a week after the death of his beloved sister. Locke was the executor of his estate. Boyle left funds to establish a series of annual lectures to defend Christianity against objections to its basic tenets. The lectures continue to this day.

2. Natural Philosophy

Boyle considered natural philosophy to be an important part of philosophy. He believed God gave humans three books to aid in their salvation: “the book of scripture,” “the book of conscience,” and “the book of nature.” In works such as Of the Study of the Book of Nature and Some Considerations Touching the Usefulness of Experimental Natural Philosophy (1663), Boyle argues that the natural world had not only been intentionally designed by God but had been designed specifically to be understood, at least in part, by rational human minds. He believed that humans equipped with reason could make use of detailed observation, under controlled experimental conditions, to uncover the hidden structure of nature. Boyle’s efforts to bring chemistry out of the disreputable shadows of alchemy, as well as all sorts of other projects he undertook in natural philosophy, were justified as part of the theologically acceptable study of the natural world, God’s great automaton, the study of which Boyle believed too many people neglected. Boyle saw it as a religious duty to investigate natural phenomena and publish the knowledge he gained for the benefit of humanity. This Baconian approach to science can be seen throughout his research, including his chemical analyses of medicines, his investigations of air pressure, his study of human anatomy, his invention of the friction match, his efforts to expand the human lifespan, and even his work to advance agriculture and animal husbandry techniques.

Boyle seems to have spent nearly equal time doing experimental natural philosophy, studying the Bible as well as the ancient languages associated with it, and analyzing his own conscience. He put the same intellectually rigorous effort, aided by significant financial resources, into all three. Boyle’s entire philosophy (his metaphysics, his epistemology, and his ethics) is intertwined with these three religiously motivated projects. Though Boyle is known today mostly for his work in various areas of natural philosophy, these achievements cannot be fully appreciated without understanding their place in Boyle’s religion.

It is important to emphasize that Boyle’s approach to natural philosophy, though influenced by Descartes, is more explicitly indebted to philosophers such as Francis Bacon and Pierre Gassendi (1592-1655). In the article “Parcere Nominibus: Boyle, Hooke and the Rhetorical Interpretation of Descartes” (1994), Edward Davis explores Descartes’s influence on Boyle during the 1660s, under the influence of Hooke, who taught Boyle Cartesian philosophy. However, it is misleading to describe Boyle as a Cartesian.

Descartes’s influence both occurred earlier and was less extensive than this view implies. Boyle read Descartes’s Passions of the Soul in 1648, before his association with Hooke. While this has been downplayed as a minor work compared to Descartes’s Meditations on First Philosophy and the Principles of Philosophy, it does give an accurate and succinct presentation of Descartes’s philosophy, including his mechanical account of the human body. Furthermore, along with the works of Galileo and Gassendi, it represents one of Boyle’s earliest exposures to the mechanical philosophy. And while Boyle does later present many of his views in Cartesian terms and agrees with Descartes’s basic dualistic and theistic ontology, there are fundamental differences between their philosophies, such as their views on the essence of matter, the possibility of a vacuum, the role of experiment in science, and the possibility and extent of knowledge based on experience. On the other hand, Boyle had been exposed to Bacon’s conception of science since his time at Eton. The provost of Eton, Henry Wotton, was Bacon’s cousin. Furthermore, Bacon’s influence can be seen in the work of many of the members of the Hartlib Circle. Thus, it is more accurate to say that in natural philosophy Boyle was primarily a Baconian who agreed with Gassendi on many important issues and with Descartes on others, and who often expressed his ideas in Cartesian terms.

In works such as A Discourse of Things Above Reason (1681) and On the High Veneration Man’s Intellect Owes to God (1684), Boyle distinguishes between demonstrative rational arguments and what can be inductively inferred from experience. Like Bacon, Boyle believed that theory should conform to observation. He tried, with mixed results, to avoid premature metaphysical speculation in favor of theories that could be tested by experiment. He agreed with Bacon’s claim in Novum Organum (1620) that the hidden structure of the natural world is too subtle to be penetrated by the Aristotelian, deductive approach to science, and that technology can aid in our investigation of the natural world. Boyle thought this approach yielded new scientific information that could potentially be used for the benefit of humanity. Boyle tried to put into practice something like the science Bacon envisioned in works such as Novum Organum and New Atlantis (1627).

Many areas of Boyle’s philosophy are intimately connected to his natural philosophy, including his rejection of scholastic Aristotelianism, his acceptance of the corpuscular mechanical philosophy, his work in chemistry, alchemy, medicine, and pneumatics, as well as his philosophical views regarding the nature of knowledge, perception, substance, real and nominal essences, causation, and alternative possible worlds.

a. Rejection of Aristotelianism

Central to Boyle’s natural philosophy is his general rejection of scholastic Aristotelianism. In works such as About the Excellency and Grounds of the Mechanical Hypothesis (1674), he rejects Aristotle’s theory of motion as the actualization of a potential, as well as his distinction between natural and unnatural motion, holding that the local motion involved in the mechanical interactions of corpuscles is inherently more intelligible. He also rejected the scholastic notion of substantial form and used controlled experiments to investigate the Aristotelian terrestrial elements, forms, and qualities. For example, Boyle was the first philosopher to write an entire book about cold, a property the scholastics claimed to be one of the four primary qualities of matter but had discussed only in the most general terms. Boyle’s book included all sorts of experiments he had conducted on the nature of cold, each described in meticulous detail.

Boyle rejected the scholastics’ deductive syllogistic approach to science. He agreed with Bacon that the natural world was too complex for the categorical syllogism to penetrate. He thought that scientific progress requires an inductive method that posits a hypothesis that can then be tested by experiment involving multiple controlled observations. Because the theories could be modified in light of new empirical evidence, Boyle believed the experimental method was fundamentally superior to the scholastic syllogistic model of science.

Boyle’s rejection of scholastic Aristotelianism in works such as The Sceptical Chymist (1661), and The Origin of Forms and Qualities (1666), was also based in part on his acceptance of the mechanical philosophy. This early modern philosophical movement sought to explain natural phenomena in terms of matter and motion, rather than, for example, the composition and proportion of Aristotelian terrestrial elements. Boyle thought mechanical explanations were inherently more intelligible than explanations based on the elemental model because they appealed to properties which themselves were more intelligible, such as size, shape, and motion, rather than to ultimately obscure causes such as real qualities or substantial forms. For Boyle, generation, corruption, and alteration could all be explained mechanically, as various types of interaction between microscopic particles of matter he called corpuscles.

This rejection of the elemental model of explanation also extended to other theories of natural philosophy that were popular in his day, such as the alchemical theory of Paracelsus (1493-1541), which involved three chemical “principles” (salt, sulfur, and mercury), and the more recent five-element theories of chemists such as Nicolas Le Fevre (1615-1669). When fire-analysis experiments revealed that some compound bodies could be reduced to five, rather than only four, homogeneous elements, some natural philosophers thought this was evidence of a fifth element. Boyle rejected the elemental explanatory model altogether. Instead, he argued that there was only one kind of material substance, and what appear at the macroscopic level to be different elements are actually structural modifications of this universal matter’s mechanical properties.

In a similar way, Boyle also rejected the Aristotelian notion of natural motion. In Book 8 of the Physics, Aristotle argued that each element has a natural location in the universe and a natural tendency to return to this location. This was used to explain such things as why rocks fall and smoke rises. In contrast, Boyle argued that all matter was essentially passive and insensible, lacking any tendencies or dispositions beyond its mechanical properties. Matter can be acted upon but contains no internal force, source of motion, substantial form, or disposition.

The traditional Aristotelian qualities of hot, cold, wet, and dry could be mechanically explained in a similar way. For example, Boyle thought that heat was not a primary quality of matter, but instead a property that is reducible to a particular type of rapid corpuscular motion. The conception of heat as molecular motion is a direct descendant of this view. In a similar way, Boyle thought the power of a key to open a lock is not due to some real quality, substantial form, or occult power of the key; rather, it is an emergent power, ultimately reducible to the size, shape, and motion of the key and the lock, which Boyle called their mechanical affections.

It is important to note that Boyle’s objections against Aristotelian natural philosophy were usually directed more toward the views of later scholastics, such as Julius Caesar Scaliger (1484-1558), than toward those of Aristotle himself, for whom he had great respect. Boyle’s approach to ethics, for example, shows this respect was more than lip service, since it provides what is essentially an Aristotelian analysis of the causes of moral virtue. It is also important not to conflate the mechanical philosophy, the corpuscular hypothesis (Boyle’s own version of the mechanical philosophy), and the experimental philosophy (the method by which Boyle often tested theories).

b. The Mechanical Philosophy

Boyle coined the term Mechanical Philosophy and used it to describe any attempt to explain natural phenomena in terms of matter and motion, rather than in terms of substantial forms, real properties, or occult qualities. For Boyle, this included the work of a wide variety of philosophers that otherwise differed in many respects. His list of mechanical philosophers included the ancient atomists Democritus, Leucippus, Epicurus, and Lucretius—names synonymous with atheism at the time—as well as his contemporaries Galileo, Descartes, Gassendi, Hobbes, Locke, and Newton.

Boyle’s own corpuscular version of the mechanical philosophy makes him both an empirical representationalist and an indirect realist. Though Galileo’s The Assayer (1623) is likely the first early modern work to raise the influential distinction between primary and secondary qualities, Boyle developed this distinction and made it an important part of his natural philosophy. In The Origin of Forms and Qualities, among other works, he argued that our senses provide a representation of an independently existing, external physical world, which is ultimately composed of material particles moving through empty space. Boyle held that these corpuscles have mechanical affections, properties such as size, shape, and motion, which are the primary qualities of matter, real properties that exist in any bit of material substance, no matter how small. The secondary qualities we perceive, such as color, sound, taste, odor, and warmth, are mental perceptions that are produced by these primary qualities causally interacting with our sense organs, but do not actually exist as real qualities in the object of perception itself. Thus, perception involves information about the external world entering the brain as a result of the causal interaction between the conscious perceiver and the object perceived.

Boyle used the term “corpuscle” to describe the microscopic material particles, and their clusters, of which he believed the material world was composed. Boyle thought God has the power to infinitely divide matter, even if this is beyond our rational comprehension, but the actual physical world is composed of minima or prima naturalia, microscopic particles of matter which never are, as a matter of fact, divided. These basic corpuscles interact and combine to form larger and larger clusters until they form the ordinary macroscopic material substances with which we are familiar.

On the surface, Boyle’s mechanical philosophy seems very similar to Descartes’s, but their views differ in several important respects. Briefly looking at their differences helps us understand the uniqueness of Boyle’s view. Descartes argues that the “attribute,” or essence, of matter is extension in space. He also held that there is no real distinction between a substance and its attribute. Just as there is no body that lacks extension, Descartes held that there is no extension that lacks body. Descartes held that the universe was a plenum, completely filled with material substance. He even thought that the famous mercury vacuum created by Evangelista Torricelli (1608-1647), while devoid of air, was filled with “subtle matter”—particles small enough to penetrate the pores of the glass—and that we could deduce the existence of such particles from the nature of matter itself.

Boyle agreed that all matter was extended in space, but he was not committed to Descartes’s elegant metaphysical system. Boyle thought theory had to be subordinate to observation. Extension, rather than being the essence of matter of which all other properties are mere modifications, is only another empirically manifest mechanical affection like size, shape, texture, arrangement, and solidity. For Boyle, empty space is not only logically possible but also empirically corroborated by experiments like those performed by Torricelli, by Otto von Guericke (1602-1686), and by Boyle himself with Robert Hooke. Boyle also thought motion in empty space was more intelligible than motion in a plenum, which Descartes had to explain by resorting to a complex theory of circular motion. Until there is empirical evidence to support the existence of subtle matter, Boyle believed its postulation violated Ockham’s razor.

Boyle believed that mechanical explanations were inherently more intelligible than those of the Aristotelians or the Paracelsians because they involved easily understandable concepts like size, shape, and motion. He thought the local motion involved in the mechanical interaction of corpuscles is inherently more intelligible than the Aristotelian conception of motion as the actualization of a potential. Boyle thought the appeal to substantial forms in natural philosophy produced explanations that were vacuous when compared to mechanical explanations. The Paracelsians seemed no better, appealing to vague notions such as the “archeus,” “astral beings,” and “blas.” Furthermore, being firmly rooted in alchemy, they were often secretive and intentionally obscure. However, explanations that appealed only to mechanical properties were clear, intelligible, and often had the advantage of being empirically testable.

In About the Excellency and Grounds of the Mechanical Hypothesis (1674), Boyle points out that no one appeals to substantial forms when mechanical explanations are available, as, for example, when one is shown how the moon is eclipsed by the shadow caused by the position of the earth relative to the sun. Likewise, there is no reason to appeal to witchcraft to explain how a concave mirror can project the image of a man into the air, once catoptrics is understood. Boyle thought Aristotelians and Paracelsians failed to realize that this mechanical approach can be applied to natural phenomena in general.

Boyle was interested in occult qualities, natural phenomena in which the effect is observable, but the cause is not, such as magnetic and electrical attraction. Boyle thought such phenomena could be explained mechanically in terms of corpuscular effluvia, the emission of small corpuscular clusters. In A Discourse of Things above Reason (1681), though, Boyle also recognized that some phenomena cannot be mechanically explained. These included the miracles featured in the Bible, as well as more traditional philosophical problems such as whether or not matter is infinitely divisible, how mind-body interaction is possible, and how human free will and moral responsibility can be compatible with divine foreknowledge. Perhaps these might be explained by future philosophical investigation, but they resist straightforward mechanical explanation.

The influence of the mechanical philosophy can be seen throughout Boyle’s other intellectual endeavors and provides his basic approach to philosophy. This influence is apparent in his metaphysical views on the nature of substance and causation, his defense of the corpuscular hypothesis, his epistemological views on the role of experiment in scientific explanation and the limits of reason, and his theological views on the importance of studying the book of nature and its potential for medicine.

c. Chemistry

Boyle is considered by many to be the father of modern experimental chemistry. Through years of diligent work he became a skilled chemist. His interest and work in chemistry lasted from the early 1650s to the end of his life. His social status and efforts to show that natural philosophy was a theologically acceptable pursuit did much to make the science of chemistry socially respectable. Boyle’s most important contribution to chemistry is his systematic critique of both the Aristotelian and Paracelsian theories of natural philosophy.

In The Sceptical Chymist (1661), Boyle points out the limitations of fire analysis as a universal method of separating compound substances into their homogenous components, a method many Aristotelians and Paracelsians used. For example, a green stick burned in open fire seems to separate into four homogenous parts, demonstrating its compound nature: The smoke was the element of air being separated, the hissing and snapping of the sap indicated the water element, the quantity of fire grew as the stick burned, and the remaining ash was the element of earth that was left. Paracelsians had a similar explanation, separating the stick into the chemical principles of salt, sulfur, and mercury.

Boyle thought the separation could be better explained by the rapid mechanical bombardment of corpuscles from the fire onto the structure of the corpuscles composing the stick, setting them in motion. Chemical analysis revealed that the smoke and ash are not homogenous elements but are compound bodies themselves. Some compound substances, such as gold, could be burned for extended periods at extreme temperatures without separating into other homogenous substances. Furthermore, chemical distillation of other compound substances, such as raisins, could produce five homogenous substances.

Boyle was able to chemically sublimate several substances, such as sulfur, turning them from a solid state to a gas and back without going through a liquid phase. Boyle thought such experiments had serious consequences for the elemental model since, according to it, the release of a gas involved the separation of an element or chemical principle, which would require a diminution of the whole. If a substance can be turned back and forth from a solid to a gas again and again without any sign of disintegration, then such a diminution clearly has not taken place. The only alternative explanation on the elemental model would be that the substance has transmuted back and forth into different elements. However, if this is the case, then neither can be considered a true element.

Inspired by Bacon’s utopian model of science, Boyle tried to compile “experimental histories” of different substances. Some of these projects led to completed works, such as An Essay about the Origin and Virtues of Gems (1672) and Short Memoirs for the Natural Experimental History of Mineral Waters (1685). Others, such as the Philosophical History of Minerals, never came to fruition, though much of the research was completed. These projects were records of chemical experiments and other empirical observations concerning the given substance. The goal was to create a sort of publicly accessible database of the chemical analysis of every known substance. Boyle prioritized substances such as the traditional Aristotelian elements and Paracelsian chemical principles, “noble” metals like gold, and bodily fluids such as blood, due to their potential medical value. Concerning salt, a basic chemical principle according to the Paracelsians, Boyle claimed to be able to distinguish three different kinds, each of which he could chemically produce.

Boyle believed colors were caused by the mechanical properties of material corpuscles. In works such as Experiments and Considerations Touching Colours (1664) and New Experiments Concerning the Relation between Light and Air (1668), Boyle presents a chemical analysis of colors and light. He also analyzed samples of phosphorus he had acquired, which produce light chemically. Boyle achieved significant success in these endeavors, though this pales in comparison to the success of later philosophers on the nature of color. This line of investigation also led Boyle to discover things not directly related to color, such as a reliable method of distinguishing an acid from a base.

Developing an interpretation of a laboratory accident of Hennig Brandt, in 1680 Boyle saturated some coarse paper in phosphorus and drew a stick coated with sulfur across it, creating a steady flame. This was the first friction match. The creation of a reliable and eventually safe way to easily produce fire was a major technological advancement that changed the world.

Boyle spent the last twenty years of his life engaged, often with the help of Ranelagh, in the chemical analysis of medical recipes. These efforts did much to bring chemistry out of the shadows of alchemy and into the light of social respectability. Throughout his work in chemistry, Boyle advocated openness in the publication of experimental results, including even those experiments that were unsuccessful. Nonetheless, there were exceptions to this openness involving alchemy.

d. Alchemy

Many of the early modern philosophers, most notably Isaac Newton, had a significant interest in alchemy, and Boyle was no exception. Lawrence Principe in The Aspiring Adept: Robert Boyle and his Alchemical Quest (1998), and William Newman and Lawrence Principe in Alchemy Tried in the Fire: Starkey, Boyle, and the Fate of Helmontian Chymistry (2002), present a detailed analysis of Boyle’s alchemical pursuits, though one should also read Hunter’s account. The early Boyle scholars Henry Miles and Thomas Birch actually destroyed much of Boyle’s work in alchemy, fearing it would tarnish his reputation as a scientist. During his lifetime, however, Boyle’s interest in alchemy was extensive and well known. Though Boyle often tried to distance chemistry from its alchemical association, many of his projects in natural philosophy were clearly alchemical.

Boyle’s alchemical endeavors were motivated by three goals: to uncover the hidden nature of physical reality, to find “extraordinary and noble medicines,” and to acquire accurate accounts of supernatural events that might help convince religious skeptics. Boyle expressed an interest in finding the philosopher’s stone as early as 1646, though he mentions it more as a humorous exaggeration than a current project. In a letter to Ranelagh in May of that year, he complains that he is not destined to find the philosopher’s stone, since his initial attempts at chemical analysis had been so unsuccessful.

Boyle believed it was possible to transmute one substance into another, and this included the traditional alchemical quest of turning lead into gold. He believed the possibility of transmutation directly followed from the mechanical philosophy. If there is only one universal type of matter, and the differences between the macroscopic substances we perceive are the result of structural differences at the microscopic level, then it follows that causing changes in the structure and arrangement of corpuscles might cause substantial changes at the macroscopic level. Since gold and lead have similar macroscopic properties, there might be only a subtle difference between them at the microscopic level.

Boyle claimed to have witnessed the transmutation of lead into gold on more than one occasion. As early as 1652 he claimed to have acquired a quantity of philosopher’s mercury, a substance believed to be required for gold transmutation. Boyle also claimed to have turned gold into a base metal, using a powder given to him by a mysterious stranger. In some works, Boyle describes successful transmutation experiments on other substances. Boyle spent a great deal of time, effort, and financial resources in these pursuits, which included searching for the elixir of life, a medicine capable of curing all diseases and extending the human lifespan.

Boyle was hoodwinked on more than one occasion by charlatans who claimed to have alchemical knowledge or the rare substances required for his alchemical pursuits. The most notable incident involved a con man named Georges Pierre. Boyle eventually realized he was being had, and there is evidence that he was aware of the danger of such scams and viewed them as an unfortunate but necessary risk in the pursuit of alchemical knowledge, a risk that his unique wealth allowed him to take.

Influenced by Bacon’s utopian conception of science, Boyle thought scientific information, including his own detailed reports of chemical experiments, should be made public for the benefit of humanity. This allowed his experiments to be reproduced and the knowledge acquired to be used to help people, especially in areas such as medicine, where the benefit to the public was obvious and immediate. There were limits to this support of scientific openness, though. For example, Boyle was concerned that the publication of instructions for turning lead into gold could collapse the world economy, bringing social chaos. Upon Boyle’s death, Newton, also a dedicated alchemist, made attempts to obtain Boyle’s alchemical notes regarding the transmutation of lead to gold. Boyle had anticipated this and left detailed instructions in his will to prevent it.

Furthermore, despite Boyle’s support of scientific openness as well as his aversion to taking oaths, Boyle often employed secrecy in his alchemical pursuits. The sort of secrecy involved here, however, was a part of the cost of networking with other alchemists to share recipes and other experimental data, and was considered common practice in the world of alchemy. Most alchemists were secretive, and would exchange recipes and materials only if their secrets were kept. Boyle was justified in believing that, if he had refused to make such promises, other alchemists would not have shared their work with him. Nevertheless, this is a notable exception to his otherwise deep aversion to taking oaths, as well as his Baconian belief that scientific data should be open to the public for the benefit of humanity.

e. Medicine

Boyle also had a deep interest in medicine. Though he never formally studied it, much of his research in natural philosophy was either directly medical in nature or motivated by medical goals, both practical and theoretical. He nonetheless distrusted physicians, after an event in his youth in which he became gravely ill when a physician at Eton gave him the wrong medicine by mistake. Furthermore, he generally rejected their Galen-based theories in favor of mechanical ones. He noted that chemical remedies often worked better than the Galenist practice of bloodletting, and that many of Galen’s views were based on claims about human anatomy that turned out to be incorrect. Boyle thought most patients were better off not seeking a doctor’s treatment.

At the same time, Boyle knew, respected, and was respected by many of the leading physicians of his day. Boyle’s London neighbor was Thomas Sydenham (1624-1689), one of the greatest physicians of the day. Sydenham read Boyle’s work and liked it so much that he dedicated his own book, Methodus Curandi Febres (1666), to him. He sometimes even asked Boyle and Ranelagh to accompany him on house calls. Boyle’s medical work was so respected that Oxford gave him an honorary Doctorate of Medicine, the only degree he ever received.

Boyle’s work in medicine is entwined with his work in natural philosophy. While the two should not be conflated, as Boyle worked on many nonmedical projects in natural philosophy, neither can be fully understood apart from the other. The development of Boyle’s interest in medicine coincided with his interest in natural philosophy in general, beginning around 1646, increasing in the mid-1650s, and lasting the rest of his life.

One of Boyle’s earliest published works was a collection of medical recipes entitled An Invitation to a Free and Generous Communication of Secrets and Receits in Physick (1655). Though Boyle worked on medical projects throughout his scientific career, a renewed interest in medicine began in the late 1660s. He would go on to steadily publish books on medical topics for the rest of his life, including Memoirs for the Natural History of Human Blood (1684), Of the Reconcileableness of Specifick Medicines to the Corpuscular Philosophy (1685), Some Receipts of Medicines (1688), Medicina Hydrostatica (1690), Experimenta et Observationes Physicae (1691), and Medical Experiments (1692).

Boyle worked with Locke on a few medical projects that are worth noting. Though early 21st-century scholars remember Locke primarily for his work in epistemology and political philosophy, he considered himself first and foremost a physician. Boyle and Locke collaborated for several years to create a Baconian experimental history of human blood. This was part of a larger project of Boyle’s to create records of experimental observations regarding every known substance, with priority given to substances, such as blood, with potential value to medicine. Their work was interrupted while Locke was travelling or Boyle was ill, but their persistence resulted in the publication of Memoirs for the Natural History of Human Blood (1684).

A second medical project with Locke was the collection of data for testing the miasma theory of disease. This is particularly noteworthy because this theory proposes the mechanical explanation that disease is caused by noxious vapors moving in the air. The theory holds that these vapors act as a contagion, penetrating the bodies of those who come in contact with them through respiration. Boyle believed the contagions were composed of corpuscles and might originate deep underground, being released by human activity such as mining. Boyle and Locke hypothesized that these noxious corpuscular emanations were then spread far and wide by the wind. Believing disease and weather were linked, they collected data from physicians across the country on both the weather and the patients they had treated, looking for correlations. While this was a relatively minor project compared to some of Boyle’s other achievements, it is noteworthy since it attempted to use empirical data to test a mechanical explanation. One should not conflate the mechanical philosophy with the experimental philosophy, but the points where they intersect provide insight into Boyle’s philosophy.

Another medical collaboration in which Boyle participated was the race to find a cure for the Great Plague of 1665-1666, an epidemic of bubonic plague which killed a fourth of London’s population, including Boyle’s former mentor George Starkey. Boyle’s belief in the miasma theory convinced him to leave London during this time. Despite this, Boyle was still part of a general effort to cure the plague that included Ranelagh, Sydenham, Locke, and many others. Boyle’s particular efforts primarily consisted of developing medical recipes he hoped would be useful to plague victims, which he then sent to Henry Oldenburg (1619-1677).

Boyle spent the last twenty years of his life engaged in medical research with his sister, Katherine Ranelagh. Through their vast network of correspondents, they would find medical recipes which they would then chemically analyze. Through medical research, Boyle found the clearest way to wed his passion for natural philosophy with his philanthropic goals.

Although he sometimes exaggerated his poor health, Boyle also suffered from very real and serious ailments including malaria, edema, seizures, kidney stones, toothaches, and deteriorating eyesight. He also suffered throughout his life from melancholy and complained of imaginative fits he described as “ravings.” During these episodes, he was carried away by his imagination, making it difficult to work. Boyle considered these ravings both a medical condition and a moral defect and spent years seeking a remedy. Since Boyle distrusted doctors and was an expert chemist, he often treated these illnesses with his own concoctions, sometimes making his condition worse. In 1670, Boyle suffered a severe stroke that left him partially paralyzed. He eventually recovered most of the mobility he had lost and continued working on his experiments.

f. Pneumatics

In 1643, Evangelista Torricelli, a friend and advocate of Galileo, filled a glass tube with mercury, turned it upside down, and placed it in a basin of mercury. The level of mercury in the tube lowered, but some mercury remained in the tube, suspended by the weight of the air (the air pressure) pressing down on the surface of the mercury in the basin. Since the tube was airtight, Torricelli reasoned that the area in the tube above the mercury must be a vacuum. Through Marin Mersenne and his vast correspondence network, news of the experiment quickly spread throughout Europe.

Otto von Guericke (1602-1686) heard of the Torricelli experiment and designed a pump capable of evacuating a receiver whose two hemispheres were then held together so firmly by the external air pressure that sixteen horses could not pull them apart. Boyle had been interested in the nature of respiration for some time, so when he and Hooke, then Boyle’s laboratory assistant, heard of von Guericke’s impressive feat, they set about to create their own air pump. Boyle designed an improved model which featured a chamber made of glass, allowing direct observation of the phenomena within the evacuated receiver. Boyle first approached the scientific instrument-maker Ralph Greatorex (1625-1675) to build it, but when he failed, Hooke took up the difficult challenge and succeeded.

From the spring through the fall of 1659, Boyle and Hooke performed dozens of experiments using the air pump and published the results in New Experiments Physico-Mechanical Touching the Spring of the Air and its Effects (1660). In this book, Boyle provides extremely detailed presentations of 43 of the experiments, giving compelling evidence that air is a substance distinct from space, that air is elastic and has a spring, and that air pressure is so powerful that a glass vial of water placed in the receiver explodes when the air is removed. They demonstrated that air is required for phenomena such as combustion, respiration, and sound. They even placed a Torricellian barometer in the receiver, showing that the mercury does not remain suspended in the vacuum. Spring of the Air established Boyle’s scientific reputation. With its success, Boyle went from being an amateur gentleman interested in natural philosophy to being the leading scientist of the day.

The book highlighted Boyle’s genius for developing experiments that revealed important scientific information, and he also included detailed critiques of the other theories he had studied concerning the nature of air. The detail of his analysis astounded even other natural philosophers such as Henry Power (1623-1668), who claimed, “I never read any tract in all my life, wherein all things are so curiously and critically handled, the experiments so judiciously, and accurately tried, and so candidly and intelligently delivered.” It also influenced Newton, who saw it as a paradigm of scientific research.

At many of the early meetings of the Royal Society, Boyle was asked to replicate some of the experiments. Unlike other natural philosophers, Boyle had the financial resources to conduct the experiments and to repair the temperamental air pump when it broke. He even had an additional air pump made at considerable expense, which he gave to the Royal Society on May 15, 1661.

The book was also controversial, and it remains so to this day. Steven Shapin and Simon Schaffer explore the social construction of science, using the controversy between Hobbes and Boyle over the air pump experiments as their focal point in their influential book Leviathan and the Air Pump: Hobbes, Boyle and the Experimental Life (1985). However, one should also read Hunter’s account. The Jesuit priest Francis Linus (1595-1675) tried to replicate some of the experiments and offered an alternative Aristotelian interpretation of the results, defending the view that nature abhorred a vacuum in Treatise on the Inseparable Nature of Bodies (1661). Christiaan Huygens (1629-1695) also reported that he could not replicate some of the experiments. Boyle praised Linus for his use of experiment, but pointed out the defects in his experimental practice in A Defense of the Doctrine Touching the Spring and Weight of the Air (1662). He added that further experiments with a J-shaped tube corroborated his claim that the reciprocal proportion between the pressure and volume of air was constant. This became known as Boyle’s Law.

This is controversial because Boyle appealed to experiments with the J tube actually performed by other natural philosophers like Henry Power and Richard Towneley (1629-1704). Furthermore, it was Hooke, rather than Boyle, who worked to find the precise numerical relation between air volume and pressure, while Boyle was more interested in the philosophical significance of the proportion being reciprocal and constant.
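The reciprocal proportion at issue can be illustrated with a short numerical sketch. The figures are hypothetical round numbers chosen for illustration, not measurements recorded by Boyle, Hooke, Power, or Towneley, and the function names and the assumed atmospheric value of 29.5 inches of mercury are likewise assumptions. In a J-shaped tube sealed at the short end, the pressure on the trapped air is the atmospheric pressure plus the height of the extra mercury standing in the long leg, and the law states that this pressure multiplied by the volume of the trapped air remains constant.

```python
# Illustrative sketch of Boyle's law: pressure * volume stays constant
# for a fixed quantity of air at a fixed temperature.
# All numbers are hypothetical round values, not historical data.

ATMOSPHERE_IN_HG = 29.5  # assumed atmospheric pressure, in inches of mercury


def trapped_air_pressure(mercury_excess_in: float) -> float:
    """Total pressure on the trapped air: atmosphere plus the extra mercury column."""
    return ATMOSPHERE_IN_HG + mercury_excess_in


def predicted_volume(initial_volume: float, initial_pressure: float,
                     new_pressure: float) -> float:
    """Volume predicted by the reciprocal proportion: V2 = V1 * P1 / P2."""
    return initial_volume * initial_pressure / new_pressure


# Start with 12 units of trapped air and no excess mercury in the long leg.
v1 = 12.0
p1 = trapped_air_pressure(0.0)

# Add mercury until the long leg stands 29.5 inches higher: the pressure doubles.
p2 = trapped_air_pressure(29.5)
v2 = predicted_volume(v1, p1, p2)

print(v2)                  # 6.0: doubling the pressure halves the volume
print(p1 * v1 == p2 * v2)  # True: the product is unchanged
```

An experimenter in this setting would compare the volume actually observed in the short leg with the predicted value. Close agreement across many mercury heights is what corroborates the claim that the proportion is reciprocal and constant.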

Even more significant was a series of objections raised by Boyle’s fellow mechanical philosopher Thomas Hobbes, upon which Leviathan and the Air Pump focuses. Hobbes offered a contrary mechanical interpretation that was consistent with observation. Following Descartes’s interpretation of the Torricelli experiment, Hobbes suggested that subtle matter was passing through microscopic pores in the glass so that the receiver was full of matter and not a true vacuum. Since it is possible to give an alternative mechanical explanation consistent with observation, Hobbes argued one cannot use experiments to decide between them. Furthermore, since multiple mechanical interpretations are possible for any experimental observation, observations are never completely independent of theory.

In An Examen of Mr. T. Hobbes his Dialogus Physicus De Natura Aeris (1662), Boyle replied by distinguishing between “matters of fact,” which can be tested, and mere “hypotheses,” which result from metaphysical speculation. It is possible that subtle matter penetrated the glass, but until there is empirical evidence to support this, positing the existence of subtle matter violates Ockham’s razor. Notably, by the early 21st century compelling evidence had emerged that an evacuated receiver contains billions of subatomic particles, such as neutrinos, far smaller than the pores of the glass.

Boyle also was motivated by a desire to show a theistic alternative to the equally mechanical materialism of Hobbes, Gassendi, and the ancient atomists, which was then strongly associated with atheism. For a time, Hobbes’s name was almost synonymous with atheism. Boyle had tried to show, since the early 1650s, that a mechanical philosophy could be compatible with Christianity.

In the end, Boyle wrote some ten books concerning his work with the air pump: New Experiments Physico-Mechanical Touching the Spring of the Air and its Effects (1660); A Defense of the Doctrine Touching the Spring and Weight of the Air (1662); An Examen of Mr. T. Hobbes his Dialogus Physicus De Natura Aeris (1662); New Experiments Concerning the Relation between Light and Air (1668); A Continuation of New Experiments Physico-Mechanical Touching the Spring and Weight of the Air and their Effects (1669); New Pneumatical Experiments about Respiration (1670); Of a Discovery of the Admirable Rarefaction of Air (1670); Flame and Air (1672); A Continuation of New Experiments Physico-Mechanical Touching the Spring and Weight of the Air and their Effects (1680); and The General History of Air (1692). Eventually, though, his attention shifted to medical chemistry.

3. Philosophy of Science

Boyle was well known for his views on the role of experimental evidence in natural philosophy. Boyle’s philosophy of science was primarily influenced by Bacon. In Novum Organum (1620) and New Atlantis (1627), Bacon had challenged natural philosophers to employ an inductive scientific method based on the careful application of technology to make detailed empirical observations, instead of relying on the syllogistic approach favored in the Scholastic tradition, which made deductive inferences from universal principles. Bacon argued that if the universal principles themselves turned out to be false, the conclusions deduced from them would be unjustified. Rather than trying to anticipate what nature should be like according to reason, natural philosophers should make detailed observations of what nature is actually like. They should then interpret these observations and form inductive generalizations about the natural world. This approach to science allows observational evidence to have epistemic priority over theory, so that theories can be modified in the face of new empirical evidence. Bacon envisioned a future “history of qualities,” a sort of publicly accessible scientific database of empirical observations.

Boyle took this challenge seriously and developed an experimental method that used detailed observation, aided by new technology, to reveal nature’s hidden structure. This approach is apparent in his work in pneumatics, his chemical research to create experimental histories of substances, and his projects on cold, air, light, color, minerals, and gems. Many of these projects never came to fruition, but on some he worked steadily for years. For instance, Boyle’s natural history of Ireland never even got off the ground, but his empirical approach to the study of blood was fruitful and eventually led to medical advances which now routinely save lives. It is also important to note that this collection of empirical data is not the blind data collection of the “narrow inductivist conception of scientific inquiry” criticized by Carl Hempel in Philosophy of Natural Science (1966). Boyle prioritized the experimental investigation of substances with obvious benefit to society, and Boyle’s empirical data collection was hypothesis driven.

Boyle’s commitment to the mechanical philosophy was consistent with his views on the role of experiment in science. Boyle would often develop mechanical explanations of phenomena that served as hypotheses, which he would then design experiments to test. He thought that testability was important in hypothesis development as well as in determining what questions science should pursue. He had a genuine talent for creating experiments designed to test theories, and in many cases this provided new scientific information. Following Bacon, Boyle tried to resist non-empirical metaphysical speculation and modify theories in the light of new experimental evidence. The results are mixed, but when he did engage in metaphysical speculation, such as in his treatment of the arguments for body-to-body occasionalism, he prefaced his remarks by noting that none of the theories he discussed could be empirically tested.

Comparison with Descartes on the role of experiment in natural philosophy is instructive. Experimental observation played a much different role for Boyle than it did for Descartes. Descartes is famous for conducting ingenious experiments, but rather than being used to test or falsify a hypothesis, they often played a part in the reduction of a complex scientific question into more basic ones. In Rules for the Direction of the Mind (1628) and Discourse on the Method (1637), Descartes describes a scientific method that involves reducing a problem into more and more fundamental problems until a problem is reached that is so basic that a self-evident intuition solves it. One can then use this intuitive solution in a series of deductive inferences, solving the problems until one reaches a solution to the original one.

Furthermore, for Descartes, empirical observation was not a reliable method of testing hypotheses, since he believed the senses provide only confused modes of thought. The only properties of matter about which we can be certain, for Descartes, are the geometric properties of extended space. He believed this method of science could achieve the same level of certainty as mathematics since it restricted itself to clear and distinct deductions from matter’s geometric properties. For Descartes, physics is applied geometry.

By contrast, Boyle thought theory must be epistemically subordinate to observation, so he used experiments to test a theory. Instead of using them in a reductive process of finding self-evident intuitions, he designed experiments specifically to falsify or corroborate a claim. In this way, claims such as “air is needed for respiration” could be empirically supported, while claims such as “air is identical to space,” could be refuted. For Boyle, scientific knowledge was more likely to be inductively inferred than geometrically deduced.

Concerning Boyle’s general epistemology, in works such as A Discourse of Things above Reason (1681), Boyle distinguishes between things that can be known by reason and things that can be known through experience. Boyle also believed that at least some ideas are innate. Examples of innate ideas include the belief that contradictories cannot both be true, that the whole is greater than the part, and that every natural number is either odd or even.

Furthermore, Boyle believed that some truths are beyond a human’s capacity to understand. These are things which are true, and our intellect has sufficient cause to assent to them based on experience, authentic testimony, or mathematical demonstration, but when it reflects on them, it finds itself at a strange disadvantage. Boyle includes three kinds of beliefs in his taxonomy of things above reason.

The first kind he labels “incomprehensible” since it includes belief in things beyond our comprehension. For example, our finite minds cannot grasp the infinite nature of God. Boyle thinks we can comprehend that God exists and some of the things that God is not, but we cannot fully understand the boundless nature of his perfections. Boyle declares this to be truly supra-intellectual.

Boyle calls the second kind of thing above reason “inexplicable.” This includes beliefs for which we are unable to conceive of their manner of existing, or how the predicate can be applied to the subject. Boyle gives examples such as the infinite divisibility of matter and the incommensurability of the diagonal of a square to the length of its sides.

Boyle calls the final kind of thing above reason “unsociable,” but it might better be labeled “incompatible.” This class includes true propositions that seem incompatible with other propositions known to be true. For example, human free will seems to be incompatible with God’s foreknowledge of future events, but necessary for moral responsibility. Mind and body are distinct substances, but they seem to causally interact. Boyle thought these were real problems and had real solutions but were likely beyond a human’s finite capacity to understand, though he also thought philosophers should continue to try.

Like Descartes, Boyle believed that we could have knowledge of things that are beyond our capacity to clearly imagine, such as the mathematical properties of a chiliagon. We can demonstrate necessary truths about a 1000-sided object and show it has different properties than a 1001-sided object. Despite this, the images our minds form of these shapes are indistinguishable.

Boyle also distinguished between real and nominal essences, which, along with his work on primary and secondary qualities, influenced Locke’s epistemology. In A Free Enquiry into the Vulgarly Received Notion of Nature (1686), Boyle begins by listing all the ways the term “nature” is used. He then distinguishes the “notional” sense, which is the way we choose to use words, from the way nature really is. Boyle also discusses the distinction in the Origin of Forms and Qualities (1666).

4. Substance Dualism

Boyle was a substance dualist, postulating that the universe consists of two types of substance: purely material corpuscles and nonphysical, conscious souls. Boyle accepted Descartes’s definition of substance as a type of entity that was not ontologically dependent on anything but God, whereas a mode is ontologically dependent on a substance. Shape, for example, cannot exist on its own, but is ontologically dependent on the bit of matter that has it.

Boyle’s dualism was influenced by Descartes, especially after his work with Robert Hooke, who taught him Cartesian philosophy, but there are important differences between their similar metaphysics. Descartes held that spatial extension was the “Attribute,” or essence, of matter, while thought was the essence of mind. Accordingly, all true properties of matter were modifications of extension, such as size, shape, and motion. In a similar way, since thought is the essential attribute of mental substance, all properties of mind are modes or types of thought.

Although Boyle agreed that thought was mental and matter was extended, he was not committed to Descartes’s elegant, rationally deduced substance-attribute-mode model. The mechanical affections Boyle associated with matter were derived from experience. For example, Boyle included solidity as another empirically based mechanical affection, but it is not clear how one can explain it as a mode of Cartesian spatially extended matter.

Boyle saw that bodies need some minimal force of resistance for mechanical interaction to be possible, though he emphasized such a force was nothing like a rational disposition or internal source of motion. Boyle also believed God gave matter the power to transfer motion upon collision, another potential problem for Descartes, since modes should not be able to transfer.

Likewise, for Descartes, the existence of a void or vacuum in space—that is, an area of space containing no matter whatsoever—is logically impossible. Since the attribute of body is extension, and there is no real distinction between a substance and its attribute, any extended area of space must contain body. Boyle’s views on the nature of the material world were more influenced by Bacon and Gassendi. He believed the elegance of a metaphysical system is not as important as its correspondence to empirical observation. He thought the air pump experiments supported the idea that a vacuum in space, devoid of all matter, was logically possible, and the existence of a vacuum should be posited until there was empirical evidence for the presence of matter in the evacuated receiver.

A final difference between Boyle’s dualism and that of Descartes was Boyle’s belief in animal consciousness. Descartes thought animals lacked a soul and were merely incredibly complex, divinely designed machines. Although they behaved as if they suffered, nonhuman animals lacked any conscious mental states. Descartes performed many animal dissections, including vivisections. Boyle saw the scientific need for vivisection since some anatomical features are only observable in living bodies. He even performed some during his sojourn in Ireland during the early 1650s. He gave up the practice, though, because of the observable suffering it caused. Boyle even had a preference for free-range chicken, but this may have been as much about flavor as chicken flourishing.

Boyle believed much instinctual behavior in nonhuman animals is purely mechanical, such as involuntary blinking when an eyelash is touched by a feather. Although he believed nonhuman animals were capable of conscious sensations, he thought they lacked rationality. Like other natural phenomena, nonhuman animal behavior sometimes seems rational, but, contrary to the scholastic Aristotelians, he thought the material world contained no rational dispositions.

5. Causation

Fundamental to Boyle’s philosophy is the belief that matter is passive, having no internal power, force, source of motion, or substantial form beyond the primary qualities of size, shape, solidity, and motion. He rejected the scholastic tendency to see intelligent dispositions everywhere in nature, such as the view that nature abhors a vacuum, or the view that an element has an internal disposition to move toward a natural location in the universe. Boyle acknowledged that the regularity seen in the natural world makes it sometimes seem like there is rational behavior, such as the regular motion of celestial bodies, or the tendencies of chemical substances to repeatedly behave in uniform ways. Despite this, he rejected the view that matter had power beyond its mechanical properties and sought to demonstrate how natural phenomena could be explained in terms of the motion of particles obeying certain laws of motion which he believed God had established. In works such as The Christian Virtuoso (1744), Boyle argued that the regularities we see in nature are a manifestation of God’s power and that divine volitions cause the laws of nature.

Boyle believed that the ultimate cause of motion is God, who created bodies, set them in motion, and maintained the laws of motion by divine will. God does grant matter certain basic powers such as solidity and the power to transfer motion to other bodies upon collision, but these are to be understood as unconscious mechanical properties rather than anything like mental dispositions or the internal sources of motion invoked by scholastic Aristotelian natural philosophy.

Boyle was aware of, and even sympathetic to, occasionalism, the view that God is the cause of anything that requires a cause. However, he never explicitly endorsed it. He does speak of it favorably in folios 38 to 40 of volume 10 of the Boyle Papers. While not explicitly endorsing it, Boyle presents three arguments intended to show that body-to-body occasionalism is not in itself absurd. Boyle does not here discuss mind-body occasionalism, but rather how God causally interacts with matter to create the natural world.

This is a minor discussion in his vast corpus, and should not be given undue emphasis. Its relevance to Boyle’s views on causation, though, makes it worthy of inclusion here. Boyle generally tried to avoid non-empirical metaphysical speculation or metaphysical system building, and he begins by pointing out that the issue cannot be settled by any testable experiment. Boyle then explicitly appeals to Ockham’s razor. Since God’s concurrence by itself is sufficient to cause the motion of bodies, it is superfluous, and even potentially impious, to attribute such power to finite bodies. If God wills a body to be in location a, and later wills it to be in location b, this alone is sufficient to move it. Attribution of a second cause to matter itself is not necessary.

Boyle’s second argument anticipates the philosophy of David Hume (1711-1776) by claiming that causation itself never appears to the senses. The power of one body to move another body is not directly observable. We only perceive that when one body hits another there follows a motion in the second body. This point is essential to Hume’s formulation of the problem of induction, supporting the claim that our belief in causation cannot be justified as a matter of fact. For Boyle, the fact that the power of causation is not manifest to the senses shows that the true cause of the motion could be God. Therefore, occasionalism cannot be ruled out as absurd.

Boyle’s third argument is that it might not even be possible to conceive of one body communicating motion to another. If finite bodies are collections of modes ontologically dependent on the attribute of extension, for example, they should not be able to cause motion in another body. It thus should not be possible for us to conceive of a body transferring its motion to another body on collision. Occasionalism, therefore, cannot be ruled out as absurd since it actually seems more comprehensible than attributing the power of causation to finite bodies.

Boyle incorrectly labels Descartes as a sort of deist. Deists believed that, after the initial divine causal impulse, the universe ran of its own accord, obeying the laws of motion without the constant intervention of God. However, Descartes believed that God is constantly involved in creating the world through one continuous divine act. Boyle was aware of the similar body-body occasionalism of Louis De La Forge, in which God creates motion by recreating an object in different locations at different times. Boyle, however, seems to have preferred what Peter Anstey has described as “nomic occasionalism.” According to this type of body-body occasionalism, bodies are not totally passive but have basic, mechanical powers, such as solidity and the power to transfer their motion to other bodies upon collision. On this view, God causes the initial motion, preserves and conserves that motion, and determines the direction and speed of bodily motions before and after collisions. Like many of his contemporaries, Boyle believed that the laws of nature are divine volitions. In the case of miracles, though, God can suspend a law of nature, a further manifestation of divine power. Yet again, Boyle was cautious and hesitant to proclaim nomic occasionalism over deism, or the so-called cinematic occasionalism of De La Forge, pointing out that none of these views can be easily empirically tested.

In any case, it seems clear that Boyle’s occasionalism was confined to body-to-body interaction. Boyle thought that human minds were capable of genuine causal agency. This agency played an essential role in his views on the nature of moral responsibility, as well as his theological views about what is necessary for salvation. Our souls are connected to our bodies and somehow causally interact with them. Here again, Boyle is hesitant to commit himself to any specific theory beyond what can be experimentally tested. He believed that how mind-body interaction is possible, as well as how free will is consistent with divine foreknowledge, are likely mysteries beyond the ability of reason to solve.

6. God

By now it should be clear that the single most important influence on Boyle’s philosophy was his personal religious beliefs. His contributions to philosophy, chemistry, pneumatics, and medicine can all be interpreted as the development and fulfillment of a lifelong religious quest. Boyle thought there were three true books of wisdom, the “book of scripture,” the “book of nature,” and the “book of conscience.” He thought all three were important and spent nearly equal amounts of time and energy on each.

Boyle was christened at the chapel at Lismore Castle in Ireland as an infant and brought up as an Anglican Protestant, though he was greatly influenced by Puritanism. The terrible storm Boyle witnessed on his grand tour with Isaac Marcombes was a transformative experience for him, and many of his philosophical projects can be seen as attempts to fulfill the oath he took to survive it.

Boyle thought that, of the traditional arguments for the existence of God, the teleological argument was the strongest. Boyle acknowledged that the existence of God could not be rationally demonstrated, but he believed the natural world abounded with empirical evidence of God’s power and wisdom. He thought the incredible complexity and order of the universe was evidence of God’s existence. The vastness of the universe, and the speed with which the earth and celestial objects move, Boyle saw as evidence of God’s unbounded power. He thought that God’s constant concurrence was needed to sustain the universe’s existence.

He was particularly amazed by the human body and the bodies of nonhuman animals, which he interpreted as divinely constructed machines. Internal organs were smaller machines ingeniously and exquisitely designed to work together to sustain the life of the animal. Ignorant of natural selection, Boyle thought the incredible complexity of their mechanical structure was compelling evidence of God’s existence. In one early letter, Boyle claimed to have learned more about God’s creation dissecting fishes than in all the books he had read. At a macroscopic level, he thought that the climates of the different regions of the earth and other geological features were intentionally designed to sustain the lives of various animals.

Boyle also used the famous clock at Strasbourg as an analogy for “this great automaton the world.” He thought the universe itself was intentionally designed by God to be understood by rational creatures, though parts of this creation are beyond human comprehension. Boyle believed that, since the universe was a manifestation of God’s greatness, one should study the book of nature as an aid to salvation.

Boyle also had a basic modal semantics. He believed God has the power to create alternative universes with different laws of nature. Boyle interpreted these possible worlds as potential divine creations. In addition to possible alternative creations of God, in Of the High Veneration Man’s Intellect Owes to God (1685), Boyle claims the size of the actual universe is so great that distant regions of space might have other areas, the size of our observable universe, that contain different planets and creatures, and even might have different laws of nature.

In the traditional theological debate between divine voluntarism, which holds that God’s will is prior to his reason, and divine intellectualism, which holds that God’s reason is prior to his will, Boyle has often been regarded as an important early modern voluntarist, but the label needs qualification. Boyle believed it was rash to claim that God’s acts had to conform to our finite conception of reason, and he generally rejected the a priori approach to theology advanced by many intellectualists. There is no way for us to deduce a priori which of the countless possible worlds God chose to create. Boyle thought we could learn about God’s magnificent creation through empirical observation. The problem with placing God’s reason above his will was that we are limited by our finite understanding of a priori truths. The ultimate contingency of the laws of nature calls for their empirical investigation, rather than a priori deduction. On the other hand, Boyle did not think God did things arbitrarily. He thought everything happened according to God’s divine plan, even if we could not completely understand it. Boyle’s rejection of intellectualism has more to do with the limits of our finite reason than with any priority of God’s will over his reason.

Boyle believed everyone had the capacity for salvation. Boyle, Ranelagh, and other members of the Hartlib Circle collaborated on a number of projects to make the Bible available to more people, including overseeing the publication of translations of the Bible into Irish, Malay, and Algonquin. This has allowed much of the Algonquin language to be preserved. Such projects were controversial at the time, but Boyle saw them as part of his religious duty.

To further his understanding of the Bible, Boyle spent years mastering ancient Biblical languages, including Greek, Syriac, Aramaic, and Arabic. He learned Hebrew to read the Torah and sought out Jewish scholars for advice on his translations. He argued for religious toleration, though he thought Christianity held the only path to salvation.

Boyle believed in the existence of supernatural creatures such as angels, demons, and witches. In Of the High Veneration Man’s Intellect Owes to God (1685), he claimed that angels, both good and evil, are rational but completely incorporeal, and that there could be as many species of angels and demons as there are nonhuman animals, with subtle moral differences between them. On the other hand, he also believed that most witch trials were unjust and not cases of real witchcraft. He tried to apply his empirical scientific method to the investigation of supernatural phenomena by creating a sort of database of reliable accounts of supernatural events, just as his Baconian histories of qualities were records of reliable experimental observations of natural substances. Boyle was convinced that enough reliable accounts of supernatural phenomena would make skepticism about Christianity seem unreasonable. He even saw to the publication of what he believed to be a true account of a poltergeist: Perreaud’s Devil of Mascon (1658). He also tried to investigate what he thought to be a reliable account of precognition.

Despite a lifetime of religious pursuits, Boyle also had significant religious doubts. These doubts troubled him, and throughout his life he sought spiritual guidance from friends, family, and clergy. He worried that his wealth had been taken from Ireland unjustly and that his philanthropic endeavors were inadequate. He also feared that he had committed a sin against the Holy Ghost by ignoring opportunities to repent for self-acknowledged sins.

Boyle intended to write a book about atheism, but it was never completed. He left a substantial endowment in his will to start a series of annual lectures defending the existence of God and the basic tenets of Christianity against the dangers of atheism he perceived. The sermons started in 1692 and lasted steadily until 1935, after which time they were given frequently, but sporadically. Since 2005, they have been given every year once again.

7. Ethics

Although Boyle is best known for his scientific endeavors, he was also fundamentally concerned with ethics. His earliest attempts at philosophy were in ethics, and ethics dominated his philosophy throughout the years he spent at his estate in Stalbridge during the 1640s, following his return to England from the grand tour with Isaac Marcombes. At some point during the late 1640s to early 1650s, Boyle had a conversion experience in which the focus of his work shifted permanently to natural philosophy. Nonetheless, he never abandoned his ethical concerns.

His most extensive ethical work is the Aretology, a systematic study of virtue. Written between 1645 and 1647 and never published during his lifetime, the treatise defends the claim that the key to human flourishing is the attainment of “felicity,” which Boyle understood as a supreme, sufficient, contenting happiness, ultimately achievable only after the death of the body and the contact of the soul with the divine. Felicity is the goal of eudaimonia because Boyle believes it is the only thing that is good in itself. Boyle rejects pleasure, honor, wealth, and even knowledge as approaches to achieving felicity, arguing instead that “to the palace of felicity the only highway is virtue.” This warrants the systematic study of moral virtue to which the title refers.

Boyle begins by claiming that the proper subject of moral virtue must be the rational soul rather than the affections of the senses. He then adopts a basically Aristotelian causal analysis of moral virtue, complemented with dashes of stoicism. Thus, the final cause of virtue is felicity, as we have seen. The material cause of virtue is the human soul. The formal cause of virtue is what Boyle terms “mediocrity,” the Aristotelian idea that a moral virtue is a mean between a vice of deficiency and a vice of excess, which one obtains only through habitual repetition until it becomes part of one’s character. The efficient cause of virtue is the most complex. Boyle sees it as a combination of God, the capacity that God gave us to develop virtue, mental habit, and living in accordance with right reason.

Boyle was greatly influenced by stoicism, having read the classic works under Isaac Marcombes. This influence is apparent throughout his moral treatises. Boyle’s ethics was also heavily influenced by Johann Alsted (1588-1638), a German Calvinist.

8. Casuistry

Boyle was a dedicated casuist, believing that a detailed analysis of his own conscience was just as important as the study of nature or the study of the Bible, and he devoted just as much of his time and effort to it. Boyle was just as meticulous in the analysis of his own conscience as he was at chemical analysis, scrutinizing his behavior, taking detailed notes, and discussing them regularly with close friends and spiritual advisors such as Ranelagh, Locke, Gilbert Burnet (1642-1715), and Edward Stillingfleet (1635-1699).

Boyle’s intense examination of his own conscience likely goes back to the conversion experience he had during the night of the terrible storm on his grand tour, but it was probably also influenced by his study of stoicism. Boyle even provided a stipend for Robert Sanderson to help him publish his Lectures on Human Conscience, a book based on a series of lectures that Sanderson gave at Oxford in the 1640s. It is considered a classic in the field of casuistry.

Throughout his life, Boyle also suffered from manic fits he described as “ravings,” in which his imagination seemed to run away beyond his control, ravishing his attention. He found these fits of restless fancy disturbing and debilitating, and he made all sorts of efforts to treat these episodes both medically and by developing coping mechanisms to calm himself when the fits occurred.

Boyle scrutinized his daily moral behavior. For example, Boyle sometimes had to make promises of secrecy to obtain new alchemical recipes. This not only involved taking an oath, but also ran counter to his general advocacy of openness in experimental data. These sorts of tensions gave Boyle and his spiritual advisors plenty of material to analyze. A full understanding of Boyle’s thought has to appreciate his equal dedication to the study of the book of nature, the book of scripture, and the book of conscience.

9. References and Further Reading

a. Recent Editions of Boyle’s Works

  • The Works of Robert Boyle (Pickering & Chatto, 1999-2000), ed. Michael Hunter and Edward B. Davis.
    • This fourteen-volume set is the definitive edition of Boyle’s work.
  • Selected Philosophical Papers of Robert Boyle (Hackett, 1991), ed. M.A. Stewart.
    • An excellent paperback edition of some of Boyle’s most important works.
  • A Free Enquiry into the Vulgarly Received Notion of Nature (Cambridge, 1996), ed. Edward B. Davis and Michael Hunter.
    • A paperback edition of this important later work by Boyle, with a good introduction and chronology.
  • The Works of the Honourable Robert Boyle (Rivington, 1772), ed. Thomas Birch.
    • This was the classic edition, but has been surpassed by the Hunter and Davis edition.

b. Chronological List of Boyle’s Publications

  • An Invitation to a free and generous Communication of Secrets and Receits in Physick (1655)
  • Some Motives and Incentives to the Love of God (Seraphic Love) (1659)
  • New Experiments Physico-Mechanical, touching the Spring of the Air and its Effects (1660)
  • Certain Physiological Essays (1661)
  • The Sceptical Chymist (1661)
  • Some Considerations touching the Style of the Scriptures (1661)
  • A Defense of the Doctrine Touching the Spring and Weight of the Air (1662)
  • An Examen of Mr. T. Hobbes his Dialogus Physicus De Natura Aeris (1662)
  • Some Considerations Touching the Usefulness of Experimental Natural Philosophy (1663)
  • Experiments and Considerations Touching Colours (1664)
  • New Experiments and Observations Touching Cold (1665)
  • Occasional Reflections upon Several Subjects (1665)
  • Hydrostatical Paradoxes (1666)
  • The Origin of Forms and Qualities (1666)
  • New Experiments Concerning the Relation between Light and Air (1668)
  • A Continuation of New Experiments Physico-Mechanical Touching the Spring and Weight of the Air and their Effects (1669)
  • Of Absolute Rest in Bodies (1669)
  • New Pneumatical Experiments about Respiration (1670)
  • Cosmical Qualities (1670)
  • Of a Discovery of the Admirable Rarefaction of Air (1670)
  • The Usefulness of Natural Philosophy, II (1671)
  • An Essay about the Origin and Virtues of Gems (1672)
  • Flame and Air (1672)
  • Essays of Effluviums (1673)
  • The Saltness of the Sea (1673)
  • The Excellency of Theology Compared with Natural Philosophy (1674)
  • About the Excellency and Grounds of the Mechanical Hypothesis (1674)
  • Some Considerations about the Reconcileableness of Reason and Religion (1675)
  • Experiments, Notes, Etc., about the Mechanical Origin of Qualities (1675)
  • Of a Degradation of Gold Made by an Anti-Elixir (1678)
  • Experiments and Notes about the Producibleness of Chemical Principles (1680)
  • A Continuation of New Experiments Physico-Mechanical Touching the Spring and Weight of the Air, and their Effects (1680)
  • The Aerial Noctiluca (1680)
  • A Discourse of Things Above Reason (1681)
  • New Experiments and Observations, made upon the icy Noctiluca (1682)
  • Memoirs for the Natural History of Human Blood (1684)
  • Experiments and Considerations about the Porosity of Bodies (1684)
  • Of the High Veneration Man’s Intellect owes to God (1684)
  • Short Memoirs for the Natural Experimental History of Mineral Waters (1685)
  • An Essay of the Great Effects of Even Languid and Unheeded Motion (1685)
  • Of the Reconcileableness of Specifick Medicines to the Corpuscular Philosophy (1685)
  • A Free Enquiry into the Vulgarly Received Notion of Nature (1686)
  • The Martyrdom of Theodora and of Didymus (1687)
  • A Disquisition about the Final Causes of Natural Things (1688)
  • Some Receipts of Medicines (1688)
  • Medicina Hydrostatica (1690)
  • The Christian Virtuoso (1690)
  • Experimenta et Observationes Physicae (1691)
  • The General History of Air (1692)
  • Medicinal Experiments (1692)
  • A Free Discourse against Customary Swearing (1695)
  • The Christian Virtuoso, The Second Part (1744)

c. Correspondence

  • The Correspondence of Robert Boyle (Pickering & Chatto, 2001), ed. Michael Hunter, Antonio Clericuzio, and Edward B. Davis.
    • This six-volume edition of Boyle’s correspondence is the standard in the field and a companion to the Pickering & Chatto edition of The Works of Robert Boyle.

d. Work Diaries

  • Boyle diligently kept diaries of his experimental work starting in the 1640s. Thanks to the work of Michael Hunter and Charles Littleton, these are available online at http://www.bbk.ac.uk/boyle/workdiaries/.

e. Biographies

  • Hunter, Michael. Boyle: Between God and Science (Yale, 2009).
    • This is the best biography of Boyle to date, and includes important recent discoveries in Boyle studies.
  • Hunter, Michael. Robert Boyle by Himself and His Friends (Cambridge, 1994).
    • This edited volume of biographical and autobiographical essays about Boyle is noteworthy for the inclusion of fragments from William Wotton’s lost Life of Boyle.
  • Maddison, R.E.W. The Life of the Honourable Robert Boyle (Taylor & Francis, 1969).
    • This is another biography of Boyle with excellent coverage of Boyle’s Oxford period, but Boyle’s early life is covered by reprinting his own account as presented in the autobiographical An Account of Philaretus During his Minority (also included in Hunter 1994 above).
  • Masson, Flora. Robert Boyle: A Biography (Constable and Company, 1914).
    • An early biography of Boyle with many notable anecdotes.

f. Selected Works on Boyle

  • Alexander, Peter. Ideas, Qualities, and Corpuscles: Locke and Boyle on the External World (Cambridge, 1985).
    • This is an exploration of Boyle’s profound influence on John Locke.
  • Anstey, Peter. The Philosophy of Robert Boyle (Routledge, 2000).
    • This is the first book-length treatment of Boyle’s philosophy.
  • Anstey, Peter. “Boyle Against Thinking Matter,” in Late Medieval and Early Modern Corpuscular Matter Theories, ed. Christoph Lüthy, John Murdoch, and William Newman (Brill, 2001).
  • Baxter, Roberta. Skeptical Chemist: The Story of Robert Boyle (Morgan Reynolds Publishing, 2006).
  • Boas, Marie. Robert Boyle and Seventeenth-Century Chemistry (Cambridge, 1958).
  • Boas-Hall, Marie. Robert Boyle on Natural Philosophy (Indiana University Press, 1965).
  • DiMeo, Michelle. “‘Such a Sister Became Such a Brother’: Lady Ranelagh’s Influence on Robert Boyle,” Intellectual History Review 25.1 (2015), pp. 21-36.
  • Eaton, William. Boyle on Fire: The Mechanical Revolution in Scientific Explanation (Continuum, 2005).
    • This work explores the lasting influence of Boyle’s philosophy of science.
  • Harwood, John. The Early Essays and Ethics of Robert Boyle (Southern Illinois University Press, 1991).
    • This is the only book that presents a detailed analysis of Boyle’s ethics.
  • Hunter, Michael. Robert Boyle Reconsidered (Cambridge, 1994).
    • This edited volume of essays brought about a new appreciation of the significance of Boyle’s natural philosophy.
  • Hunter, Michael. “How Boyle became a Scientist,” History of Science 33.1 (1995), pp. 59-103.
    • This article is a detailed account of how Boyle became a scientist.
  • Hunter, Michael. Robert Boyle 1627-1691: Scrupulosity and Science (Boydell, 2000).
    • This work is an in-depth exploration of the relationship between Boyle’s religious views and his natural philosophy. It includes Hunter’s essay, “How Boyle became a Scientist.”
  • Hunter, Michael. Boyle Studies: Aspects of the Life and Thought of Robert Boyle (Ashgate, 2015).
  • Jacob, J.R. Robert Boyle and the English Revolution: A Study in Social and Intellectual Change (Burt Franklin, 1977).
  • Kuslan, Louis, and A. Harris Stone. Robert Boyle: The Great Experimenter (Prentice-Hall, 1970).
    • Although written for children, this short book is an excellent introduction to Boyle’s natural philosophy, with detailed explanations of several of his most important experiments.
  • Newman, William, and Lawrence Principe. Alchemy Tried in the Fire: Starkey, Boyle, and the Fate of Helmontian Chymistry (University of Chicago, 2002).
  • Principe, Lawrence. The Aspiring Adept: Robert Boyle and His Alchemical Quest (Princeton, 1998).
  • Sargent, Rose-Mary. The Diffident Naturalist: Robert Boyle and the Philosophy of Experiment (University of Chicago, 1995).
  • Wojcik, Jan W. Robert Boyle and the Limits of Reason (Cambridge University Press, 2002).

g. Other Important Works

  • Ben-Chaim, Michael. Experimental Philosophy and the Birth of Empirical Science (Routledge, 2004).
  • Bourke, Evan. “Female Involvement, Membership, and Centrality: A Social Network Analysis of the Hartlib Circle,” Literature Compass 14.4 (2017).
  • Davis, Edward. Creation, Contingency, and Early Modern Science: The Impact of Voluntaristic Theology on Seventeenth Century Natural Philosophy (PhD Dissertation, Indiana University, 1984).
  • Duddy, Thomas. A History of Irish Thought (Routledge, 2002).
  • Frank, Robert G. Harvey and the Oxford Physiologists: A Study of Scientific Ideas (University of California Press, 1980).
  • Garber, Daniel. Descartes’ Metaphysical Physics (University of Chicago Press, 1992).
  • Garber, Daniel. Descartes Embodied: Reading Cartesian Philosophy through Cartesian Science (Cambridge University Press, 2000).
  • Harrison, Peter. “Voluntarism and Early Modern Science,” History of Science 40.1 (2002), pp. 63-89.
  • Harrison, Peter. The Fall of Man and the Foundations of Science (Cambridge University Press, 2007).
  • Hempel, Carl. The Philosophy of Natural Science (Prentice Hall, 1966).
  • Klaaren, Eugene. Religious Origins of Modern Science (William B. Eerdmans Publishing Company, 1977).
  • Osler, Margaret. Divine Will and the Mechanical Philosophy: Gassendi and Descartes on Contingency and Necessity in the Created World (Cambridge University Press, 1994).
  • Webster, Charles. The Great Instauration: Science, Medicine, and Reform 1626-1660 (Holmes and Meier Publishers, 1975).

Author Information

William Eaton
Email: weaton@georgiasouthern.edu
Georgia Southern University
U. S. A.

Reduction and Emergence in Chemistry

Most talk of reduction and emergence figures in discussions about the relation between different physical theories, or between physics and biology. The aim of this article is to present a different perspective through which to examine reduction and emergence; namely, the perspective of chemistry’s relation to physics.

Very broadly, reduction is associated with the idea that the sciences are hierarchically ordered and unified. As a universal thesis, reductionism takes physics to be the most fundamental science in the sense that the laws and postulates of all other sciences can, at least in principle, be derived from and explained by physics. Metaphysically, this implies that things like molecules, cells, chairs and consciousness are nothing more than the physical stuff of which they are made. On the other hand, emergence is often associated with the idea that the special sciences and their postulated entities, properties, and so forth are somehow novel and partially autonomous from physics. On this view, while the special sciences comply with physical laws, they are nevertheless autonomous, and their postulated entities are over and above physical ones. In this context, one cannot explain away molecules, cells and their respective properties by reference only to physical stuff.

The philosophy of chemistry examines in detail whether reduction, emergence, or some other notion correctly characterises chemistry’s relation to physics and, in particular, to quantum mechanics. The philosophy of chemistry illuminates possible ways of thinking of chemistry’s relation to physics, but also of reduction and emergence. Moreover, understanding chemistry’s relation to physics has important implications for how one understands the relation between other sciences. For example, biology often refers to chemical entities and processes in order to explain biological phenomena. Given this, examining chemistry’s relation to physics contributes to understanding biology’s relation to physics. Furthermore, the notions of reduction and emergence are associated with more general philosophical questions about the unity or disunity of the sciences, but also about the very nature and structure of the world. Examining reduction and emergence with respect to chemistry can contribute to these issues. A case in point is the nature and reality of entities and properties in special sciences. For example, if chemical entities are reduced to those of physics, then one could formulate an argument against the existence of chemical entities. On the other hand, if chemical entities somehow emerge from physical ones, then this may suffice to support the reality of chemical entities and of their respective properties.

Table of Contents

  1. Introduction
  2. The Significance of This Topic in the Philosophy of Chemistry
  3. Reduction in Chemistry
    1. Epistemological, or Intertheoretic, Reduction
    2. Antireductionism with Respect to Chemistry
    3. Ontological Reduction
    4. Alternative Forms of Reduction
  4. Emergence in Chemistry
    1. British Emergentism in Chemistry
    2. Strong Emergence
    3. Alternative Forms of Emergence
  5. Beyond Reduction and Emergence
    1. Unity without Reduction
    2. Pluralism
  6. Conclusion
  7. References and Further Reading

1. Introduction

What one means by reduction and emergence can vary extensively, and there are positions which argue for an understanding of chemistry’s relation to physics in a manner that goes beyond the dilemma between reduction and emergence. Nevertheless, all positions can be understood as addressing at least one of two distinct, yet often overlapping, questions:

  1. The question of the relation of the formalism of chemistry to that of physics. This is an epistemic question because it focuses on the relation between theories of chemistry and theories of physics.
  2. The question of the relation of the entities, properties, and so forth that are postulated by chemistry to the entities and so forth that are postulated by physics. This is a metaphysical question because it concerns the nature of chemical entities, properties, and so forth.

Chemistry’s relation to physics is examined with respect to different theories, concepts, entities, properties and phenomena of chemistry and of physics (Hendry 2012; van Brakel 2014). Given this, ‘to speak of “the relation between chemistry and physics” is nonsense: a whole variety of possible intertheoretical relations have to be addressed’ (van Brakel 2014: 34). Both chemistry and physics, understood as scientific disciplines, encompass various sub-disciplines and theories which have, among other things, distinct explanatory and heuristic goals. In light of this, various theories have been examined in the context of chemistry’s relation to physics, including: (a) the relation between thermodynamics and statistical mechanics (Hendry 2012: 369; Needham 2009); (b) the relation of chemistry to quantum mechanics; and, (c) the relation of organic chemistry to quantum chemistry (Goodwin 2013).

Given the above, it is not surprising that the relation between chemistry and physics involves examining the relation between different sets of entities, properties, and so forth that the relevant theories postulate. For example, chemistry’s relation to quantum mechanics has been examined with respect to (a) chemical elements and the periodic table (Scerri 2012b: 75-76); (b) molecular structure (Hendry 2010b; Weininger 1984; Woolley 1976); (c) orbitals (Villani et al. 2018); (d) chemical reaction rates (Hettema 2017: 69-86); and (e) the chemical bond (Hendry 2008; Weisberg 2008). Another feature of chemistry’s relation to physics concerns how macroscopic substances are related to their constituents (van Brakel 2014: 34). A further feature is the relation between the ‘vernacular and scientific use of substance names’ (van Brakel 2014: 34).

While none of the above features of chemistry’s relation to physics are independent from each other, each of them deserves its own article, as each involves addressing issues unique to its specific domain of inquiry. Given this, as well as the fact that reduction and emergence are mostly investigated with respect to chemistry’s relation to quantum mechanics, this article reviews reduction and emergence in the context of how chemistry and its postulated chemical entities relate to quantum mechanics and its postulated entities.

Before presenting the existing views on chemistry’s relation to quantum mechanics, it is useful to briefly specify the subject matter of the two relevant sciences. Chemistry is concerned with the composition and transformation of matter into new substances. It achieves the description, explanation, and prediction of the composition and reaction of matter by reference to entities, properties, and so forth that the theory postulates. In other words, chemistry uses concepts which are characteristic of the chemical description and which allegedly refer to entities, properties, and so forth that determine how matter is composed and reacts. Phenomena that are within the purview of chemistry are the rusting of metals, the properties of atoms and molecules, the boiling of water and the volatility of mercury. Quantum mechanics is the non-relativistic theory that describes microscopic systems (Palgrave Macmillan Ltd 2004: 1863). It is distinct from relativistic quantum mechanics and from quantum field theory. Quantum mechanics achieves the description, explanation, and prediction of microscopic systems by reference to entities and properties that the theory postulates. Phenomena that are within the purview of quantum mechanics are black-body radiation, the double-slit experiment, and the behaviour of a free particle under a magnetic field.

Note that quantum chemistry plays a very important role in understanding the relation between chemistry and quantum mechanics. In the Dictionary of Physics quantum chemistry is defined as the ‘branch of theoretical chemistry in which the methods of quantum mechanics are applied to chemical problems’ (Palgrave Macmillan Ltd 2004: 1845; see also Gavroglu and Simões 2012). In the literature on chemistry’s relation to quantum mechanics, it is not clear whether quantum chemistry is regarded as part of the higher-level theory or the lower-level one (that is, chemistry and quantum mechanics respectively). For example, Goodwin (2013) refers to the relation of quantum chemistry to quantum mechanics, implicitly suggesting that quantum chemistry is the higher-level (chemical) theory. On the other hand, there are philosophers of chemistry who compare the explanatory and predictive success of quantum chemistry with that of chemistry proper, thus implicitly suggesting that quantum chemistry is the lower-level theory.

2. The Significance of This Topic in the Philosophy of Chemistry

According to some members of the philosophy of chemistry community, chemistry is a special science that has not been considered in much detail with respect to its relation with other sciences, including physics (Scerri and Fisher 2015: 3). This is because the philosophy of science and the philosophy of physics take the relation between chemistry and physics to be an unproblematic relation of subordination of the former to the latter (for example van Brakel 2014: 13; Bensaude-Vincent 2008: 16). Epistemically, this broadly means that the descriptions, explanations, and predictions of phenomena that are provided by chemistry can at least in principle be derived from the theories of physics. Metaphysically, this broadly means that the entities, properties, and so forth that are postulated by chemistry are nothing over and above physical entities and properties.

There are two main reasons why physics may be considered ‘ontologically prior’ to chemistry (Hendry 2012: 367). First, if one takes physics to examine those things that make up chemical entities and properties, then this establishes the priority of physics in virtue of the existence of a mereological relation between chemical and physical entities (Hendry 2012: 367). Secondly, physics is considered a universal science in the sense that it sets out, at least in principle, to describe, explain, and predict everything in the world, and not just some subset of phenomena, like chemistry does (Hendry 2012: 367). Dirac’s famous quote is indicative of this stance towards chemistry and of chemistry’s status compared to physics:

The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble. (1929: 714)

In light of this, some members of the community take the investigation of chemistry’s relation to physics to be a central issue in the philosophy of chemistry, as the answer that one gives with respect to this issue determines whether, and in what sense, chemistry is an autonomous scientific discipline (Chang 2015; Lombardi and Labarca 2005). For example, Chang states that

the relationship between physics and chemistry is one of the perennial foundational issues in the philosophy of chemistry. It concerns the very existence and identity of chemistry as an independent scientific discipline. Chemistry is also the most immediate territory that physics must conquer if its “imperialistic” claim to be the foundation for all sciences is to have any promise. (Chang 2015: 193)

Some members of the philosophy of chemistry community take the investigation of chemistry’s relation to physics to be central not only for establishing the autonomy of chemistry, but also for ensuring the legitimacy of the philosophy of chemistry as a worthwhile and autonomous field of philosophy (in particular see Lombardi and Labarca 2005; Lombardi and Labarca 2007; Scerri and Fisher 2015; Schummer 2014a: 1-2; van Brakel 1999). For example, Scerri and Fisher state that

the philosophy of chemistry had been mostly ignored as a field, in contrast to that of physics and, later, biology. This seems to have been due to a rather conservative, and at times implicitly reductionist, philosophy of physics whose voice seemed to speak for the general philosophy of science. It has taken an enormous effort by dedicated scholars around the globe to get beyond the idea that chemistry merely provides case studies for established metaphysical and epistemological doctrines in the philosophy of physics. These efforts have resulted in both definitive declarations of the philosophy of chemistry to be an autonomous field of inquiry and a number of edited volumes and monographs. (2015: 3)

Lombardi and Labarca state something similar regarding the ‘traditional assumption’ of reduction:

This traditional assumption not only deprives the philosophy of chemistry of legitimacy as a field of philosophical inquiry, but also counts against the autonomy of chemistry as a scientific discipline: whereas physics turns out to be a ‘fundamental’ science that describes reality in its deepest aspects, chemistry is conceived as a mere ‘phenomenological’ science, that only describes phenomena as they appear to us. (2005: 126)

Given the above, it is no surprise that chemistry’s relation to physics has received such attention in the philosophy of chemistry. This does not mean that all philosophers who investigate the relation of chemistry to physics do so with the intention of defending the legitimacy of the philosophy of chemistry or the autonomy of chemistry. In fact, many examine the question of chemistry’s relation to physics because they take it to be relevant to the investigation of other philosophical issues, such as the reality of chemical entities and the relation between biology and physics. For example, Needham believes that views regarding biology’s reduction to physics, as they are discussed in the philosophy of mind and biology, presuppose the successful reduction of chemistry to physics (Needham 1999: 169). Therefore, the question of the relation of chemistry to physics is central not only for chemistry and the philosophy of chemistry in the manner outlined above, but also for the sciences and general philosophy as well.

3. Reduction in Chemistry

Discussion of reduction with respect to chemistry primarily occurs in the context of the distinction between epistemological and ontological reduction. In the philosophy of chemistry, epistemological reduction requires ‘that the laws of chemistry be derivable from those of physics’ (Hendry and Needham 2007: 339). Ontological reduction ‘requires only that chemical properties are determined by “more fundamental” properties’ (Hendry and Needham 2007: 339). By and large, this distinction is accepted in the literature, though there are philosophers that argue that this distinction is not helpful in spelling out correctly the relation between the two theories (Needham 2010: 169; Hettema 2012b: 164). It is worth noting that Hendry and Needham prefer using the term ‘intertheoretic reduction’ instead of ‘epistemological reduction’ as they think that the former term captures best the sort of reduction that is investigated; namely a reduction which ‘involves logical relationships between theories, rather than knowledge’ (Hendry and Needham 2007: 339).

a. Epistemological, or Intertheoretic, Reduction

Discussion of epistemological, or intertheoretic, reduction primarily happens in the context of Nagel’s account of reduction. In the philosophy of chemistry, a Nagelian reduction is understood as requiring, at least in principle, the derivation or deduction of chemistry from quantum mechanics (Needham 2010: 164; Hettema 2017: 7). A Nagelian reduction consists of two ‘formal’ requirements, namely the ‘connectability and derivability’ of the two theories (Scerri 1994: 160; see also Hettema 2017: 7). Moreover, the reduction of chemistry to quantum mechanics would fall under the cases of heterogeneous reductions. This is because ‘some typically chemical terms cannot be found in the quantum mechanical language’, thus requiring the existence of bridge laws (Scerri 1994: 160; see also Primas 1983: 5). A successful reduction would allegedly be sufficiently supported if the chemical properties of atoms and molecules can, at least in principle, be calculated by quantum mechanics ‘entirely from first principles, without recourse to any experimental input whatsoever’ (Scerri 1994: 162). Note that the latter form of quantum mechanics is often referred to as ‘ab initio quantum mechanics’ (Scerri 1994; Schwarz 2007).

In the philosophy of chemistry there has been debate on what the appropriate criteria are for a successful Nagelian reduction of chemistry to physics (see for example Hettema 2012a; 2017; Needham 1999; 2010; Scerri 1994). For example, Hettema claims that the use of the term ‘Nagelian’ with reference to the aforementioned understanding of reduction is to an extent misleading because Nagel was not so strict in his account of reduction:

Reduction is too often conceived of as a straightforward derivation or deduction of the laws and concepts of the theory to be reduced to a reducing theory, notwithstanding Nagel’s insistence that heterogeneous reduction simply does not work that way. (Hettema 2017: 1-2; see also Hettema 2012b: 146; Dizadji-Bahmani, Frigg and Hartmann 2010; Fazekas 2009; Klein 2009; Nagel 1979; van Riel 2011)

While Nagel’s account of reduction is the most widely discussed account in the philosophy of chemistry, other philosophical accounts of reduction have also been considered. They include Oppenheim and Putnam’s account of micro-reduction (Oppenheim and Putnam 1958; Hendry 2012: 368-369). Very briefly, according to this account of reduction, a theory T1 micro-reduces a theory T2 if (i) the phenomena that are explained by T2 can be explained by T1; and (ii) T1 describes the parts of the entities, properties, and so forth that are postulated by T2. According to Hendry, if ‘the micro reductive explanation takes the form of a deduction’, then Oppenheim and Putnam’s account is a kind of Nagelian reduction (Hendry 2012: 369).

Nagel, Oppenheim and Putnam take chemistry’s relation to physics to be a paradigmatic case of their respective accounts of reduction (Hendry 2012: 369). A large, though not the entire, part of the philosophy of chemistry literature discusses reduction by investigating whether these accounts of reduction correctly apply to chemistry’s relation to quantum mechanics. Popper’s understanding of reduction has also been investigated in the context of chemistry’s relation to quantum mechanics (Scerri 1998; Needham 1999).

The epistemological reduction of chemistry to quantum mechanics is primarily examined by looking at how quantum mechanics, via the Schrödinger equation, describes the chemical properties of atoms and molecules. Given this, it is useful to briefly present how quantum chemistry employs the Schrödinger equation in order to describe the chemical properties of atoms and molecules. This sub-section henceforth focuses on the non-relativistic Schrödinger equation since this is the one that is standardly employed for the description of atoms and molecules and that is discussed with respect to chemistry’s relation to quantum mechanics.

The Schrödinger equation is the ‘equation of motion for the wave function’ which describes ‘the state of a quantum-mechanical system, and (more generally) for the corresponding state-vector’ (Palgrave Macmillan Ltd 2004: 2029). The solutions of the time-dependent Schrödinger equation, Ψ(x,t), are (potentially) the wavefunctions of the system under examination (that is, of an electron, atom, molecule and so forth).

The generic form of the time-dependent Schrödinger equation is the following:

iħ ∂Ψ(x,t)/∂t = −(ħ²/2m) ∂²Ψ(x,t)/∂x² + VΨ(x,t),

where

∂: partial derivative

Ψ(x,t): a system’s wavefunction

ħ: the reduced Planck constant (Planck’s constant divided by 2π)

m: the system’s mass

x: position

t: time

V: potential energy

i: imaginary unit (square root of negative one)

If one assumes that a system’s potential energy is independent of time, then it is possible to solve the Schrödinger equation using the method of separation of variables (Griffiths 2005: 24). In this context, the resulting solutions are wavefunctions of the following form (Griffiths 2005: 24):

Ψ(x,t) = ψ(x)φ(t),

where

ψ: a function of position

φ: a function of time

Based on the ability to separate the variables of the Schrödinger equation, it is possible to formulate the time-independent Schrödinger equation, which is an equation independent of time and whose solutions are a system’s time-independent wavefunctions, ψ(x). These wavefunctions correspond to the stationary states of the system under examination.
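The separation step can be written out explicitly. The following is a sketch of the standard textbook derivation, assuming a time-independent potential V(x); the separation constant E is introduced here for illustration and is identified with the total energy of the stationary state.

```latex
\begin{align}
  % Time-dependent equation with the product ansatz
  i\hbar\,\frac{\partial \Psi(x,t)}{\partial t}
    &= -\frac{\hbar^{2}}{2m}\,\frac{\partial^{2}\Psi(x,t)}{\partial x^{2}} + V(x)\,\Psi(x,t),
    \qquad \Psi(x,t) = \psi(x)\,\varphi(t) \\
  % Substitute and divide by \psi(x)\varphi(t): the left side depends only on t,
  % the right side only on x, so both must equal a constant E
  i\hbar\,\frac{1}{\varphi(t)}\frac{d\varphi(t)}{dt}
    &= -\frac{\hbar^{2}}{2m}\,\frac{1}{\psi(x)}\frac{d^{2}\psi(x)}{dx^{2}} + V(x) = E \\
  % Time part
  \varphi(t) &= e^{-iEt/\hbar} \\
  % Position part: the time-independent Schrödinger equation
  -\frac{\hbar^{2}}{2m}\,\frac{d^{2}\psi(x)}{dx^{2}} + V(x)\,\psi(x) &= E\,\psi(x)
\end{align}
```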

The time-independent Schrödinger equation does not yield a unique solution (that is, one wavefunction) (Griffiths 2005: 27). It yields an infinite number of solutions (ψ₁(x), ψ₂(x), …), each of which corresponds to a different state of the system under examination. In accordance with the superposition principle, any linear combination of the solutions of the time-independent Schrödinger equation, each multiplied by its own time-dependent factor, is also regarded as a wavefunction that represents a possible state of the system (Griffiths 2005: 27).

The stationary state of a system, through its wavefunction ψ(x), provides useful information about the total state of the system, Ψ(x,t). First, the probability density |Ψ(x,t)|² equals |ψ(x)|². This means that knowledge of just the stationary state of a system, through the solution of the time-independent Schrödinger equation, provides the probability of finding the system in a particular region of space. Secondly, it is possible to calculate the expectation value of any dynamical variable of a state of the system through the stationary state of the system alone (Griffiths 2005: 26). Stationary states are states of definite total energy, E (Griffiths 2005: 26). Each solution to the time-independent Schrödinger equation is associated with a particular allowed total energy of the system (E₁, E₂, …). The wavefunction that is associated with the minimum total energy corresponds to the ground state of the system, whereas the wavefunctions whose total energies are larger correspond to the excited states of the system.
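These properties of stationary states can be checked numerically. The sketch below is only an illustration: it assumes a particle in a one-dimensional box (a standard textbook system that the literature discussed here does not single out), uses units in which ħ, the mass and the box width all equal 1, and all function names are hypothetical. It shows that the probability density of a single stationary state does not change in time, whereas that of a linear combination of two stationary states does.

```python
import cmath
import math

# Illustrative assumption: a particle in a one-dimensional box of width WIDTH,
# in units where hbar = mass = width = 1.
HBAR = 1.0
MASS = 1.0
WIDTH = 1.0


def energy(n: int) -> float:
    """Allowed total energy E_n of the n-th stationary state of the box."""
    return (n * math.pi * HBAR) ** 2 / (2.0 * MASS * WIDTH ** 2)


def psi(n: int, x: float) -> float:
    """Normalised time-independent wavefunction psi_n(x) inside the box."""
    return math.sqrt(2.0 / WIDTH) * math.sin(n * math.pi * x / WIDTH)


def stationary(n: int, x: float, t: float) -> complex:
    """Full wavefunction Psi_n(x, t) = psi_n(x) * exp(-i E_n t / hbar)."""
    return psi(n, x) * cmath.exp(-1j * energy(n) * t / HBAR)


def superposition(x: float, t: float) -> complex:
    """Equal-weight linear combination of the two lowest stationary states."""
    return (stationary(1, x, t) + stationary(2, x, t)) / math.sqrt(2.0)


def density(value: complex) -> float:
    """Probability density |Psi|^2 for a given wavefunction value."""
    return abs(value) ** 2


if __name__ == "__main__":
    x = 0.3
    for t in (0.0, 0.1, 0.2):
        # Second column stays constant in t; third column varies with t.
        print(t, density(stationary(1, x, t)), density(superposition(x, t)))
```

The ground state is the n = 1 solution, since `energy(1)` is the smallest allowed value; the higher n correspond to excited states.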

The time-independent Schrödinger equation for an isolated molecule provides an infinite number of solutions (that is, wavefunctions), each of which corresponds to a different stationary state of the molecule. For example, a stable isolated molecule, in virtue of being stable, is said to be in the ground state. From this, it follows that it is represented by the wavefunction that is associated with the system’s ground state and that it has the minimum total energy.

The Hamiltonian operator plays a central role in the solution of the time-independent Schrödinger equation for quantum systems and isolated molecules in particular. When the system under examination is an isolated molecule, the Hamiltonian operator corresponds to the total energy of the molecule (that is, its eigenvalues are the total energy of each state of the molecule); hence it is called the molecular Hamiltonian. In principle, the molecular Hamiltonian operator includes all the factors that determine the kinetic and potential energy of the molecule. That is, it should take into account the kinetic energy of each nucleus and electron in the system, the repulsion between each pair of electrons and between each pair of nuclei, and the attraction between each electron and each nucleus.
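For concreteness, the five contributions just listed are commonly written as follows. This is a sketch of the standard non-relativistic Coulomb Hamiltonian in SI units; the notation is an assumption introduced here: indices A, B run over nuclei with masses M_A and atomic numbers Z_A, indices i, j run over electrons with mass m_e, e is the elementary charge, ε₀ the vacuum permittivity, and r_ij, R_AB and r_iA are the electron-electron, nucleus-nucleus and electron-nucleus distances.

```latex
\begin{equation}
  \hat{H} =
    \underbrace{-\sum_{A}\frac{\hbar^{2}}{2M_{A}}\nabla_{A}^{2}}_{\text{nuclear kinetic energy}}
    \;\underbrace{-\sum_{i}\frac{\hbar^{2}}{2m_{e}}\nabla_{i}^{2}}_{\text{electronic kinetic energy}}
    \;+\;\underbrace{\sum_{i<j}\frac{e^{2}}{4\pi\varepsilon_{0}\,r_{ij}}}_{\text{electron--electron repulsion}}
    \;+\;\underbrace{\sum_{A<B}\frac{Z_{A}Z_{B}\,e^{2}}{4\pi\varepsilon_{0}\,R_{AB}}}_{\text{nucleus--nucleus repulsion}}
    \;\underbrace{-\sum_{i}\sum_{A}\frac{Z_{A}\,e^{2}}{4\pi\varepsilon_{0}\,r_{iA}}}_{\text{electron--nucleus attraction}}
\end{equation}
```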

Because of the mathematical complexity involved in the formulation of the Hamiltonian operator, atomic and molecular systems are examined within the framework of the Born-Oppenheimer approximation (henceforth BO approximation; also referred to as the adiabatic approximation). The BO approximation is a ‘(r)epresentation of the complete wavefunction as a product of an electronic and a nuclear part Ψ(r,R) = Ψ_e(r,R) Ψ_N(R)’ (IUPAC 2014: 179). The validity of the BO approximation is ‘founded on the fact that the ratio of electronic to nuclear mass […] is sufficiently small and the nuclei, as compared to the rapidly moving electrons, appear to be fixed’ (IUPAC 2014: 179).

Within the BO approximation, one can in principle formulate the Hamiltonian operator by positioning the nuclei at all the possible fixed positions. Each set of nucleonic positions corresponds to different quantum states of the system (hence to different wavefunctions) and to different values of the total energy, E, of the atom or molecule. However, in practice this process is not followed. By having prior knowledge of the quantum system that is under examination—for example, by knowing the chemical and structural properties of the examined molecule—only particular nucleonic conformations are considered when constructing the Hamiltonian operator.

The BO approximation is a feature of quantum mechanics which plays a central role in the investigation of chemistry’s relation to quantum mechanics (Bishop 2010: 173; van Brakel 2014: 31-33; Woolley 1976; 1978; 1991; 1998; Woolley and Sutcliffe 1977; Sutcliffe and Woolley 2012). It has often been invoked as putative empirical evidence for the rejection of chemistry’s reduction to quantum mechanics as well as for the support of the emergence of chemistry (see next sections). Solving the equation outside the BO approximation in order to describe atomic and molecular properties is currently investigated in chemistry and quantum chemistry (for example Tapia 2006). This implies that there are features of quantum mechanics which may further contribute to our understanding of chemistry’s relation to quantum mechanics (for example Woolley 1991).

Note that even when the nucleonic conformation is fixed in the manner represented by the BO approximation, calculating the solution of the Schrödinger equation remains a complicated task. Each nucleonic conformation is compatible with different quantum states of the system (and thus different wavefunctions). This is compatible with chemistry’s understanding of atoms and molecules because, even if the nuclei are fixed at particular positions, the electrons may behave in more than one possible way within an atom or molecule.

In light of the above, the Schrödinger equation is not solved analytically for all atoms and molecules. As Hendry states:

There is an exact analytical solution to the non-relativistic Schrödinger equation for the hydrogen atom and other one-electron systems, but these are special cases on account of their simplicity and symmetry properties. (Hendry 2010a: 212)

Instead, researchers have developed various approximate methods in order to solve it, most of which employ the BO approximation. In general, the development of computation has led to the proliferation of complex computational methods that solve the equation by following different mathematical strategies and by making different assumptions. These methods include the Valence Bond Approach, the Molecular Orbital Approach, the Hartree-Fock Method and Configuration Interaction.
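The contrast between an exact solution and an approximate one can be illustrated with the hydrogen atom itself. The sketch below is not one of the methods named above; it assumes a simple variational estimate with a single Gaussian trial function exp(−αr²), a standard textbook exercise, and works in atomic units (ħ = mₑ = e = 1, energies in hartree). The closed-form expression for the trial energy, 3α/2 − 2√(2α/π), is taken from that textbook treatment, and the function names are hypothetical.

```python
import math

# Atomic units are assumed throughout (hbar = m_e = e = 1, energies in hartree).
EXACT_GROUND_STATE = -0.5  # analytical hydrogen ground-state energy


def trial_energy(alpha: float) -> float:
    """Energy expectation value for the trial function exp(-alpha * r**2)."""
    kinetic = 1.5 * alpha
    potential = -2.0 * math.sqrt(2.0 * alpha / math.pi)
    return kinetic + potential


def minimise(f, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Golden-section search for the minimiser of a unimodal function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2.0


# Vary the single free parameter alpha to make the energy as low as possible.
best_alpha = minimise(trial_energy, 0.01, 2.0)
best_energy = trial_energy(best_alpha)

if __name__ == "__main__":
    print("optimal alpha:", best_alpha)
    print("variational energy (hartree):", best_energy)
    print("exact energy (hartree):", EXACT_GROUND_STATE)
```

Under these assumptions the optimised energy lies above the exact value of −0.5 hartree, as any variational estimate must; the gap reflects the limited form of the trial function rather than any defect in the equation being solved.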

Based on the above, there are philosophers who argue in favour of the epistemological reduction of chemistry to quantum mechanics. For example, Schwarz argues that ab initio quantum mechanics can in principle derive all ‘well-defined numerical properties’ of the chemical elements (Schwarz 2007: 168). Ab initio quantum mechanics refers to quantum mechanical methods that are ‘independent of any experiment other than the determination of fundamental constants. The methods are based on the use of the full Schrödinger equation to treat all the electrons of a chemical system’ (IUPAC 2014: 5).

While Schwarz does not examine chemistry’s relation to quantum mechanics in terms of a particular philosophical account of reduction (such as Nagel’s account of reduction), he advocates some sort of reductive relation between chemistry and quantum mechanics. He claims that the ‘difficulty’ of ab initio quantum mechanics to (in practice) derive certain chemical properties is due to the fact that ‘basic qualitative chemical concepts are so vaguely defined’ and ‘fuzzy’ (Schwarz 2007: 172, 174). Given the above, he believes that the periodic system is in a ‘transition phase’ from a primarily ‘empirical model of chemistry’ to ‘an understandable model based in physical theory’ (Schwarz 2007: 173).

The epistemological reduction of chemistry to quantum mechanics is alternatively supported by Bader’s Quantum Theory of Atoms in Molecules (QTAIM) (Bader 1990; Bader and Matta 2013; Matta and Boyd 2007; Matta 2013). The QTAIM provides a topological analysis of electron density through which one derives information regarding atomic and bonding properties. The QTAIM provides experimentally verifiable information regarding the properties of large molecules, by reconstructing their properties from ‘smaller fragments’ (Matta 2013). It is a scientific theory which ‘demonstrates that every measurable property of a system, finite or periodic, can be equated to a sum of contributions from its composite atoms’ (Bader 1990).

Bader takes the QTAIM to provide correct descriptions, explanations and predictions of the chemical properties of matter (Bader 1990: vi; see also Bader and Matta 2013; Causá et al. 2014; Hettema 2012a; Hettema 2013). While Bader does not explicitly talk about the reduction of chemistry to quantum mechanics in philosophical terms, his account is regarded in the philosophy of chemistry as representing ‘a proper, (reductionist) basis for chemistry’ (Hettema 2013: 311). This is because, according to Bader and Matta, the QTAIM allegedly supports the claim that ‘chemistry is physics’ (Bader and Matta 2013: 254). However, Hettema argues that while Bader’s view of the QTAIM suggests that the QTAIM is related to chemistry in a manner that closely resembles Kemeny and Oppenheim’s reductive eliminativist account, the QTAIM fails to be a reductive theory of this sort (Hettema 2013). Moreover, Arriaga, Fortin and Lombardi argue that while the QTAIM manages to ‘provide a rigorous definition of the chemical bond and of atoms in a molecule, it appeals to concepts that are unacceptable in the quantum-chemical context’, thus failing to sufficiently support the reduction of chemistry to quantum mechanics (Arriaga et al. 2019: 125). Van Brakel makes a similar point, arguing that the QTAIM works only after postulating facts from chemistry (van Brakel 2014: 32), thus rendering it insufficient for the support of chemistry’s reduction to quantum mechanics.

b. Antireductionism with Respect to Chemistry

Many members of the philosophy of chemistry community reject the epistemological reduction of chemistry to quantum mechanics, as understood in terms of the aforementioned accounts. As Hettema states:

The idea that chemistry stands in a reductive relationship to physics still is a somewhat unfashionable doctrine in the philosophy of chemistry. (2017: 1)

Indeed, there are alternative and often incompatible positions in the philosophy of chemistry which argue, either explicitly or implicitly, against the reduction of chemistry to quantum mechanics. These antireductionist views can be divided into two main camps (Scerri 2007b). First are those positions which reject the reduction of chemistry tout court (Schummer 1998; Schummer 2014b; van Brakel 2000). That is, they ‘deny the whole enterprise’ of reducing chemistry to quantum mechanics on grounds that have to do with the unique methodological, classificatory or other epistemological features of chemistry (Scerri 2007b: 70). Philosophers that follow this antireductionist approach support, either implicitly or explicitly, the irreducibility of chemistry by arguing that chemistry, in virtue of being a science of substances which employs unique classificatory tools and concepts, cannot be reduced to a science which looks at the micro-constituents of those substances and which disregards the classificatory or methodological tools and concepts that are of interest to chemists.

In the second camp are those positions which examine in detail how quantum mechanics describes, predicts, and explains particular chemical entities, properties, and so forth (such as the chemical bond, molecular structure, orbitals and the periodic system). They consider how quantum mechanics describes particular chemical properties and through this analysis they implicitly or explicitly argue against the reduction of chemistry to quantum mechanics (Bogaard 1978; González et al. 2019; Hendry 1998; 1999; 2010a; 2012; Ramsey 1997; Scerri 1994; 1998; Woolley 1976; 1978; 1985; 1998; Woolley and Sutcliffe 1977; Weininger 1984; Woody 2000).

For example, Scerri evaluates the manner in which the Schrödinger equation is solved so as to yield accurate results about the properties of atoms and molecules. He acknowledges that ab initio quantum mechanics has yielded relatively accurate results regarding the ground-state energy of particular atoms, and that quantum mechanics has succeeded in providing a mathematical analysis of chemical phenomena and in generating sufficiently accurate quantitative values of chemical properties such as bond strengths and dipole moments (2007b; 2012). However, he holds that this does not sufficiently support the reduction of chemistry to quantum mechanics (Scerri 1994: 164). Specifically, the approximate methods that are employed for the solution of the Schrödinger equation (and without which a solution cannot be provided) involve the use of ad hoc assumptions which, in virtue of being ad hoc and reliant ‘on experimental data’, undermine the thesis that chemistry is reduced in a Nagelian manner to quantum mechanics (Scerri 1994: 165-168; see also Scerri 1991: 320-321). Note that Hofmann (1990) presents how models and approximations have been employed throughout the history of quantum mechanics for the description of chemical properties; see also Gavroglu and Simões (2012).

Scerri invokes the periodic table and the electronic configuration model as examples that support the failure of chemistry’s reduction to quantum mechanics (Scerri 2007b: 74; Scerri 2012b: 79-80; Scerri 1991).

Before presenting Scerri’s argument, it is useful to briefly define the chemical terms that his and subsequent analyses invoke. The electronic configuration is ‘a distribution of the electrons of an atom or a molecular entity over a set of one-electron wavefunctions called orbitals, according to the Pauli principle’ (IUPAC 2014: 317). An orbital, whether atomic or molecular, is a ‘(w)avefunction depending explicitly on the spatial coordinates of only one electron’ (IUPAC 2014: 1034). An atomic orbital is a ‘(o)ne-electron wavefunction obtained as a solution of the Schrödinger equation for an atom’ (IUPAC 2014: 124). Given that orbitals depend on the spatial coordinates of electrons, the electronic configuration of an atom provides a representation of the distribution of electrons in the atom. This is particularly important in chemistry because it serves as a basis for the explanation and prediction of the type of bonds that are formed between atoms.

With respect to the periodic table, Scerri’s claim is broadly the following. The manner in which chemical elements are ordered in the periodic table is partially explained by, and could be regarded as derived from, quantum mechanics because quantum mechanics specifies the electronic configuration of the atoms of each element (Scerri 2012b: 75). However, there are certain features of the periodic table, such as the length of its periods, which are not deducible from quantum mechanics (Scerri 2012b: 77-78). Therefore, the derivation of the periodic table from quantum mechanics, and thus the reduction of chemistry, cannot be sufficiently supported.

Moreover, a Nagelian reduction ‘requires axiomatised versions of the theory to be reduced as well as the reducing theory’, which at least with respect to chemistry cannot possibly be argued for (Scerri 2006: 124). A similar point is made by Hettema regarding Nagelian reduction: ‘chemistry is a field, whereas reduction tends to be a relation between individual theories, or between laws and theories’ (Hettema 2017: 1). Furthermore, quantum mechanics does not provide on its own ‘a conceptual understanding of chemical phenomena’ (Scerri 2007b: 74). Instead, chemists employ chemical models and theories in order to formulate sufficient descriptions, explanations, and predictions of chemical phenomena and properties. Another problem for the reduction of chemistry is that quantum mechanics is symmetric under time inversion, and thus cannot provide an explanation of why chemical entities evolve in time the way they do. It can only provide a ‘reductive description’ of chemical properties independent of time (Scerri 2007b: 78). In fact, while quantum mechanics provides numerical values for particular chemical properties, it does not provide a complete explanation of a system’s chemical behaviour (Scerri 2007b: 78).

Scerri also denies that an approximate reduction of chemistry to quantum mechanics succeeds (1994; 1998). By approximate reduction, Scerri refers to Putnam’s analysis of reduction, which permits the reducing theory to be approximately and not exactly true (Scerri 1994: 161). That is, ‘the relationships postulated by the theory hold not exactly, but with a certain specifiable degree of error’ (Putnam 1965: 206-207). In this context, reduction is not undermined if ab initio quantum mechanics provides only approximate results of the value of atomic and molecular properties, as long as these results are accompanied by a specifiable degree of error. However, Scerri rejects approximate reduction because the errors ‘are seldom computed by independent ab initio criteria’ (Scerri 1994: 168). Scerri also examines approximate reduction in relation to Popper’s analysis of the reduction of chemistry. In this context, Scerri draws a very similar conclusion with respect to the approximate reduction of chemistry (Scerri 1998: 42).

Based on all the above, Scerri concludes that the reduction of chemistry is ambiguous since, depending on what the set criteria for a successful reduction are, chemistry’s reduction to quantum mechanics ‘is both successful and unsuccessful’ (Scerri 2007b: 76; Scerri 2012b: 80).

Other philosophers also argue that chemistry has failed to epistemically reduce to quantum mechanics by pointing out similar issues with respect to the quantum mechanical description of chemical phenomena (see Bogaard 1978; Hendry 1998; Hendry 2010b: 183; Primas 1983; Woolley 1976; 1998; Woolley and Sutcliffe 1977). For example, Primas argues that quantum mechanics is ‘incorrect and should be revised, partly because [it] seems incapable of rendering a robust account of concepts such as molecular shape’ (Hettema 2017: 53, see also Primas 1983). Bogaard points out that chemists disregard a number of features of the behaviour of subatomic particles when specifying an atom’s or molecule’s Schrödinger equation. These features include (a) the behaviour of subatomic particles (namely protons and neutrons); (b) the energetic contribution of the movement of the nuclei; and, (c) relativistic effects (Bogaard 1978: 346). Moreover, the fact that the Schrödinger equation is ‘adapted’ so as to provide an accurate description of each particular system challenges the view that quantum mechanics can, even in principle, deduce complete explanations of chemical phenomena (Bogaard 1978).

González et al. (2019) argue that there is a tension between the theoretical postulates of quantum mechanics and how molecular structure is understood in chemistry. In particular, Heisenberg’s uncertainty principle implies that a ‘quantum “particle” is not an individual in the traditional sense, since it has properties—those represented by its observables—that have no definite value’ (González et al. 2019: 36). Such a metaphysical understanding of quantum particles comes in contrast to chemistry’s understanding of molecular structure, which is defined ‘in terms of the spatial relation of the nuclei conceived as individual localised objects’ (González et al. 2019: 43). The failure of chemistry’s reduction is further supported by the fact that the Schrödinger equation cannot be solved analytically without the use of approximations and models (for example Bogaard 1978: 347; González et al. 2019; Hendry 2010b). These approximations and models are based on ‘theoretical assumptions drawn from chemistry’, thus rendering the quantum chemical description of complex atoms and molecules in a ‘loose relationship to exact atomic and molecular Schrödinger equations’ (Hendry 2010b: 183).

Lastly, Chang offers three considerations against reduction. First, he argues that since its advent, quantum chemistry has been practiced in a manner that required the use of pre-quantum, chemical knowledge (Chang 2015; 2017). The views of Linus Pauling, one of the main founders of quantum chemistry, allegedly corroborate this argument, as Pauling took quantum chemistry to be ‘a direct continuation of nineteenth-century organic structural chemistry’ (Chang 2015: 197-198). Secondly, Chang claims that physics consists of many different branches and that the relation of those branches with more fundamental physical theories has not been decisively shown to be reductionist. In light of this, and given that chemistry’s relation to physics is examined in the context of a physical theory (that is, quantum mechanics) which is not the most fundamental theory in physics, one should not assume chemistry to be unproblematically reduced to physics (Chang 2015: 200; Chang 2017: 365). Thirdly, Chang looks at how chemistry is done in practice and claims from this that chemistry is very far from being ‘submitted’ to physics (Chang 2015: 201). This claim allegedly undermines the reduction of chemistry to quantum mechanics since quantum mechanics has never in practice been sufficient for the description, explanation or prediction of phenomena that are within the purview of chemistry (Chang 2015: 201-202).

c. Ontological Reduction

In light of the above objections against the epistemological reduction of chemistry, there are philosophers who have investigated whether it is possible to support chemistry’s ontological reduction to quantum mechanics in a manner that is consistent with the failure of chemistry’s epistemological reduction. Most notable is Le Poidevin, who formulated a detailed account of the ontological reduction of chemical properties which does not depend on the success of an epistemic reduction of chemistry to quantum mechanics. In fact, Le Poidevin accepts that chemistry has not been epistemically reduced to quantum mechanics and maintains that, despite this, chemical elements are ontologically reduced to physical properties. He claims that the argument for the ontological reduction of chemical elements can be generalised to all chemical properties in the following manner:

Chemical properties reduce to those properties variation in which is discrete, and combinations of which constitute the series of physically possible chemical properties. (Le Poidevin 2005: 132)

In particular, he holds that the discreteness of chemical elements as specified via the periodic table supports a combinatorial argument for their ontological reduction. According to this argument, ‘a finite number of fundamental entities combine together to give a discrete set of composite elements’ (Scerri 2007a: 929).

Le Poidevin’s argument is based on two premises. The first is the ‘combinatorial criterion for ontological reduction’, which states that

a property type F is ontologically reducible to a more fundamental property type G if the possibility of something’s being F is constituted by a recombination of actual instances of G, but the possibility of something’s being G is not constituted by a recombination of actual instances of F. (Le Poidevin 2005: 132)

The second premise concerns the ‘discreteness of chemical ordering’: ‘between any two elements there is a finite number of physically possible intermediate elements’ (Le Poidevin 2005: 132).

According to Le Poidevin, the combinatorial criterion for the ontological reduction of chemical properties is preferable to existing physicalist accounts regarding the ontological reduction of special science properties because it avoids two problems that he takes to be insurmountable for physicalism. The first problem is the ‘vacuity problem’, according to which physicalism is in danger of becoming a trivial thesis depending on what one takes to be included in the domain of physics (Le Poidevin 2005: 121-122). The second problem is the ‘asymmetry problem’, according to which the supervenience relation, as postulated by physicalism, does not necessitate an asymmetric relation between higher and lower-level properties (Le Poidevin 2005: 122).

Scerri, Hendry and Needham are sympathetic towards Le Poidevin’s argument for the ontological reduction of chemical elements (Scerri 2007b: 76; Hendry and Needham 2007: 340). As Hendry and Needham state, the combinatorial argument establishes that ‘the discreteness of the elements is explained by the nomologically required discrete variation in a physical quantity, namely nuclear charge’ (Hendry and Needham 2007: 34). However, all of them hold that there are certain problematic features in Le Poidevin’s account.

First, the argument is allegedly not well-supported for all chemical properties. Scerri doubts that the combinatorial argument can be generalised so as to apply to all chemical properties because, unlike chemical elements, most chemical properties are not discrete (such as the solubility and acidity of elements) (Scerri 2007a: 929). Similarly, Hendry and Needham argue that the combinatorial argument is only investigated with respect to chemical elements, thus disregarding a large part of chemistry. This is a central shortcoming of Le Poidevin’s account because there are particular features of chemistry and of quantum mechanics which are often regarded as posing unique challenges to chemistry’s reduction to quantum mechanics. For example, the structure of molecules is a chemical property which some argue is not in principle derivable from quantum mechanics (Hendry and Needham 2007: 341-342). This is regarded as problematic for the reduction of chemistry to quantum mechanics, whether epistemic or ontological. Another issue is how chemistry describes the rate of chemical reactions. Kinetic theory and thermodynamics play a fundamental role in explaining and describing the rate of chemical reactions, and thus need to be considered in the context of chemistry’s relation to quantum mechanics (Hendry and Needham 2007: 343-344). These are problems that concern particular chemical properties and which need to be tackled if any account of (ontological) reduction is to be well-supported for all chemical properties.

Secondly, Scerri holds that Le Poidevin’s attempt to circumvent any talk about the epistemic reduction between the two relevant theories is illusory. Le Poidevin maintains that a ‘periodic ordering is a classification rather than a theory’, thus rendering his account of ontological reduction ‘theory-neutral’ (Le Poidevin 2005: 131). However, Scerri disagrees on this point, as he takes reference to the periodic table to inevitably require the investigation of how chemistry and quantum mechanics are epistemically related (Scerri 2007a: 929). Hendry and Needham take this point a step further by suggesting that reference to a theory cannot be avoided when specifying the micro-constituents of chemical elements (Hendry and Needham 2007: 344). In fact, they argue that there is ‘a close evidential connection’ between epistemological and ontological reduction; one cannot entirely avoid the investigation of inter-theoretic reduction when seeking to provide sufficient empirical support to ontological reduction (Hendry and Needham 2007: 351).

Another objection to Le Poidevin’s account is that the combinatorial argument, even if correct, does not succeed in establishing the ontological reduction of chemistry to physics. The argument allegedly establishes ‘only an asymmetrical relationship between the (actual) physical and the (merely possible) chemical’ (Hendry and Needham 2007: 349). Such a relation does not preclude the possibility of chemical properties having novel causal powers, which renders Le Poidevin’s account consistent with non-reductive (metaphysical) accounts (such as emergentist accounts) (Hendry and Needham 2007: 350).

Hendry also offers independent support to the claim that chemistry fails to ontologically reduce to quantum mechanics, outside his critique of Le Poidevin’s account. Specifically, he assumes that ontological reduction involves the acceptance of the causal completeness of physics (Hendry 2010b: 187). Given this, it follows that ontological reduction is committed to the claim that only physical entities, properties, and so forth possess novel causal powers (Hendry 2010b: 187). Based on this understanding of ontological reduction, he argues that what he calls the ‘symmetry problem’ undermines the tenability of ontological reduction. The symmetry problem arises from the fact that, for any atom or molecule, the arbitrary solutions of the Schrödinger equation are spherically symmetrical (Hendry 2010b: 186). This stands in contrast to the asymmetry exhibited by polyatomic molecules, which chemistry invokes in order to explain many of their chemical properties, such as the acidic behaviour and boiling point of the hydrogen chloride molecule (Hendry 2010b: 186). The symmetry problem allegedly challenges the ontological reduction of chemistry because it undermines the tenability of the causal completeness of physics, namely the principle that every physical effect has a physical cause (Hendry 2010b: 187). This is because

  • quantum mechanics is consistent with the view that the asymmetry of molecules ‘is not conferred by the molecule’s physical basis according to physical laws’ (Hendry 2010b: 187); and
  • the symmetry problem ‘removes much of the empirical support that is claimed for’ the causal completeness of physics (Hendry 2010b: 187).

Lastly, it should be noted that there are positions which argue for the ontological autonomy of chemistry in a manner that is implicitly or explicitly incompatible with the ontological reduction of chemistry to quantum mechanics. This includes Lombardi and Labarca (2005) and Schummer (2014b) (see subsection 5b).

d. Alternative Forms of Reduction

Despite the arguments against chemistry’s epistemological and ontological reduction to quantum mechanics, there are philosophers who attempt to establish reduction. For example, Hettema states that ‘the widespread rejection of reduction by philosophers of chemistry might have been premature’ (Hettema 2012b: 147). Hettema argues that, contrary to how Nagel’s account of reduction has been understood and argued against in the philosophy of chemistry, Nagel was in fact not so strict about the requirements for reduction (Hettema 2014: 193; see also Hettema 2012a). In light of this, Hettema proposes ‘a suitable paraphrase of the Nagelian reduction programme’ which is ‘reinforced by a modern notion of both connectibility and derivability’ (Hettema 2017: 24) (italics are in the original text). Hettema’s position is a reductive account which advocates the existence of autonomous areas. Characterising Hettema’s account as a form of reduction is justified given the quotes just mentioned. Nevertheless, it should be noted that Hettema often refers to his proposal as one that advocates a form of unity (for example Hettema 2012b; 2017). In order to explicate his proposal, Hettema analyses the development of the reaction rate theory and presents, among other things, Eyring’s theory of absolute reaction rates (2017: 71-81; see also Hettema 2012b) (see also subsection 5a).

Needham has also investigated reduction and identified those aspects of Nagelian reduction which should be amended for a more convincing defence of chemistry’s reduction to physics to be achieved. As Needham states:

Chemistry is, perhaps, so entwined with physics that what would be left after removal of physics is but a pale shadow of modern chemistry. It is, perhaps, not even clear what the removal of physics from chemistry would amount to. (Needham 2010: 163)

Needham identifies the weaknesses of Nagelian reduction and examines whether historical developments in chemistry and physics are consonant with how reduction tells us that two theories are related (2010: 170). Based on such an analysis, he argues that it is possible to understand Nagelian reduction in a way that permits and takes into account the use of approximations in science (Needham 2010: 168-169).

4. Emergence in Chemistry

The emergence of chemistry was first discussed and defended by the British Emergentists, who did so before the advent of quantum mechanics. With the development of quantum mechanics and quantum chemistry, the emergence of chemistry, as it was advocated by British Emergentists, was mostly rejected in philosophy. However, in the contemporary literature the emergence of chemistry from quantum mechanics has been reformulated and supported on new grounds. Perhaps the most detailed and widely discussed account of emergence with respect to chemistry is Robin Hendry’s account of the strong emergence of molecular structure. However, there are also alternative understandings of emergence within the philosophy of chemistry.

a. British Emergentism in Chemistry

British Emergentism refers to a group of philosophers in the 19th and 20th centuries who are regarded as the first to provide a detailed and coherent philosophical account of emergence. Among the examples that British Emergentists invoked in order to support the existence of emergence is that of chemistry, and in particular of chemical bonding. For instance, J. S. Mill argued that ‘the different actions of a chemical compound will never, undoubtedly, be found to be the sums of the actions of its separate elements’ (quote in McLaughlin 1992: 28; see also Mill 1930). C. D. Broad also advocated the emergence of chemistry on the grounds that it is not ‘theoretically possible to deduce the characteristic behaviour of any element from an adequate knowledge of the number and arrangement of the particles in its atom, without needing to observe a sample of that substance’ (Broad 1925: 70; see also McLaughlin 1992: 47; Hendry 2006: 176-180; Hendry 2010a: 210; Hendry 2010b: 185).

The putative empirical evidence that emergentists invoked in support of the emergence of chemical bonding is the failure to deduce the chemical behaviour of elements from the entities and properties that constitute those chemical elements. On this view, since one cannot describe and predict how chemical elements bond to each other solely with reference to the entities that compose them, chemical bonding is an emergent chemical property which exerts downward causal powers on the entities that constitute the relevant chemical elements (Scerri 2007a: 921).

The British Emergentists’ argument for the emergence of chemical bonding was formulated before the advent of quantum mechanics. According to McLaughlin, once quantum mechanics contributed to the understanding of atomic and molecular properties, including the chemical bond, the emergence of chemical bonding was no longer justified in the manner that British Emergentism advocated:

Quantum mechanical explanations of chemical bonding suffice to refute central aspects of Broad’s Chemical Emergentism: Chemical bonding can be explained by properties of electrons, and there are no fundamental chemical forces. (McLaughlin 1992: 49; see also Scerri 2007a)

On the other hand, Scerri argues that McLaughlin is mistaken to reject the emergence of chemistry and rejects McLaughlin’s claims that

  • there was no complete or adequate theory of chemical bonding before the advent of quantum mechanics; and
  • quantum mechanics provided a complete theory of chemical bonding (Scerri 2007a: 922-923; see also Scerri 2012a).

In fact, Scerri claims that the quantum mechanical theory of chemical bonding should be viewed as continuous with, and as enhancing, Lewis’s theory of chemical bonding (Scerri 2007a: 922-923). The advent of quantum mechanics does not refute pre-quantum, chemical theories of bonding, but rather offers a deeper understanding of chemical bonding. Chemistry remains vital in the description and explanation of the chemical behaviour of elements because quantum mechanics cannot offer by itself a complete account of chemical bonding and of the overall chemical behaviour of elements. While quantum mechanics provides quantitative information regarding particular chemical properties of elements and compounds, it ‘cannot predict what compounds will actually form’ (Scerri 2007a: 924). Quantum mechanics can neither provide an explanation of how atoms and molecules evolve in time, nor can it provide a complete explanation of their overall chemical behaviour (Scerri 2007b: 78). These two characteristics of quantum mechanics, apart from blocking the possibility of a ‘complete’ reduction of chemistry, also allegedly support the claim that chemical entities and properties emerge at a level ‘over and above what one would expect from the constituents of the system’ (Scerri 2007b: 77; see also Llored 2012: 254). What Scerri means by emergence is, however, unclear since he only specifies this notion in contrast to physicalism and does not provide a detailed account of the emergence of chemistry.

b. Strong Emergence

Hendry formulates one of the most detailed and widely discussed accounts of emergence regarding chemistry. Hendry’s account focuses on a metaphysical understanding of emergence that has direct implications for the metaphysical relation between chemical and quantum mechanical entities and properties, as well as for the nature of molecular structure. His account of strong emergence is formulated in terms of downward causation, and the putative empirical evidence that supports his position is drawn from the manner in which quantum mechanics and chemistry each describe molecular structure.

According to Hendry, the structure of a molecule strongly emerges from its quantum mechanical entities in the sense that it exhibits downward causal powers. Specifically, ‘the emergent behaviour of complex systems must be viewed as determining, but not being fully determined by, the behaviour of their constituent parts’ (Hendry 2006: 180).

Strong emergence is supported by the ‘counternomic criterion for downward causation’ (Hendry 2010b: 189). According to this criterion, ‘a system exhibits downward causation if its behavior would be different were it determined by the more basic laws governing the stuff of which it is made’ (Hendry 2010b: 189). The manner in which quantum mechanics describes a molecule’s structure allegedly satisfies the counternomic criterion and thus supports the view that molecular structure strongly emerges.

In order to support this claim, Hendry makes a distinction between ‘resultant’ and ‘configurational’ Hamiltonians. A molecule’s resultant Hamiltonian takes into account all the intra-molecular interactions and is constructed using as input only fundamental physical interactions and the value of the physical properties of the entities (such as masses, charges, and so forth) (Hendry 2010a: 210-211). Given the resultant Hamiltonian, the so-called ‘Coulombic Schrödinger equation’ is constructed, which is a complete and exact description of the relevant molecule. However, the resultant Hamiltonian is in practice never used for the solution of the Schrödinger equation. This is primarily due to the equation’s mathematical complexity. Nevertheless, if the Coulombic Schrödinger equation were to be solved, it would not distinguish between different molecular structures (specifically those of isomers), and it would not explain the symmetry properties of a molecule. Instead, quantum explanations of molecular structure are based on the construction of ‘configurational Hamiltonians’ for the solution of the Schrödinger equation of a molecule (Hendry 2010a: 210-211). Configurational Hamiltonians are constructed on the basis of ad hoc assumptions which impose on the Schrödinger equation the molecular structure that is supposed to be derived from that equation. This situation satisfies the counternomic criterion because we did not recover a molecule’s ‘structure from the “resultant” Hamiltonian, given the charges and masses of the various electrons and nuclei; rather we viewed the motions of those electrons and nuclei as constrained by the molecule of which they are part’ (Hendry 2006: 183).
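The contrast between the two kinds of Hamiltonian can be made concrete with a worked equation. What follows is a sketch in standard textbook notation rather than a formulation taken from Hendry’s texts: the symbols M_A, Z_A and R_A (nuclear masses, charges and positions), r_i (electron positions) and the fixed geometry R_A^0 are assumptions introduced here for illustration, as is the choice of the clamped-nucleus (Born-Oppenheimer) form as one common way of arriving at a configurational Hamiltonian.

```latex
% Resultant (Coulombic) Hamiltonian: built only from masses, charges and
% the Coulomb interaction; nuclear positions R_A are dynamical variables.
\[
\hat{H}_{\mathrm{res}}
  = -\sum_{A}\frac{\hbar^{2}}{2M_{A}}\nabla_{A}^{2}
    -\sum_{i}\frac{\hbar^{2}}{2m_{e}}\nabla_{i}^{2}
    +\frac{e^{2}}{4\pi\varepsilon_{0}}
     \left(
       \sum_{A<B}\frac{Z_{A}Z_{B}}{|\mathbf{R}_{A}-\mathbf{R}_{B}|}
       -\sum_{A,i}\frac{Z_{A}}{|\mathbf{R}_{A}-\mathbf{r}_{i}|}
       +\sum_{i<j}\frac{1}{|\mathbf{r}_{i}-\mathbf{r}_{j}|}
     \right)
\]

% Clamped-nucleus Hamiltonian (assumed illustration of a configurational
% Hamiltonian): the nuclear kinetic term is dropped and the nuclei are
% fixed at a chosen geometry R_A^0, which enters as an external parameter.
\[
\hat{H}_{\mathrm{conf}}\bigl(\{\mathbf{R}_{A}^{0}\}\bigr)
  = -\sum_{i}\frac{\hbar^{2}}{2m_{e}}\nabla_{i}^{2}
    +\frac{e^{2}}{4\pi\varepsilon_{0}}
     \left(
       \sum_{A<B}\frac{Z_{A}Z_{B}}{|\mathbf{R}_{A}^{0}-\mathbf{R}_{B}^{0}|}
       -\sum_{A,i}\frac{Z_{A}}{|\mathbf{R}_{A}^{0}-\mathbf{r}_{i}|}
       +\sum_{i<j}\frac{1}{|\mathbf{r}_{i}-\mathbf{r}_{j}|}
     \right)
\]
```

On this reading, the first expression contains nothing that singles out one arrangement of the nuclei over another, whereas the second depends explicitly on the geometry that is put in by hand. This is the sense in which the structure is imposed on, rather than recovered from, the equation.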

Hendry presents two examples that illustrate that the counternomic criterion is satisfied with respect to molecular structure. The first example concerns isomers (see also Bishop 2010: 172-173). Isomers are sets of molecules that contain the same number and kind of atoms, but whose atoms are arranged differently. This means that isomers differ only in terms of their structure. Isomers have distinct chemical descriptions and they are invoked for the explanation of a variety of physical and chemical phenomena. If one is to describe an isomer via the use of its resultant Hamiltonian, then the Coulombic Schrödinger equation is identical with the Coulombic Schrödinger equations that describe the other relevant isomers (Hendry 2017: 153). On the other hand, if one is to describe an isomer via the use of its configurational Hamiltonian, then the Schrödinger equation that is subsequently constructed is not identical to those that describe the other relevant isomers. According to Hendry, this means that this example satisfies the counternomic criterion. In his view, it illustrates that the molecule’s behaviour, as this is described ‘by the more basic laws governing the stuff of which it is made’ (that is, via the resultant Hamiltonian), is different from its behaviour, as this is described by assuming certain chemical properties (namely, its structure) via the configurational Hamiltonian.
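The isomer point can be illustrated with a small sketch. It is not drawn from the cited literature: the choice of ethanol and dimethyl ether (both C2H6O) as the isomer pair, the atom labels, and the function names resultant_inputs and heavy_atom_skeleton are all assumptions made here for illustration. The sketch only checks that the data which feed a resultant Hamiltonian (nuclear charges and electron count, with nuclear masses following from the same elements if the same isotopes are assumed) coincide for the two isomers, while their bonding patterns do not.

```python
from collections import Counter

# Atomic numbers (nuclear charges) of the elements used in this sketch.
ATOMIC_NUMBER = {"H": 1, "C": 6, "O": 8}

# Hypothetical encoding: each molecule is a list of bonded atom pairs.
# Labels such as "C1" are illustrative names, not standard notation.
ETHANOL = [
    ("C1", "C2"), ("C2", "O1"), ("O1", "H6"),
    ("C1", "H1"), ("C1", "H2"), ("C1", "H3"),
    ("C2", "H4"), ("C2", "H5"),
]
DIMETHYL_ETHER = [
    ("C1", "O1"), ("O1", "C2"),
    ("C1", "H1"), ("C1", "H2"), ("C1", "H3"),
    ("C2", "H4"), ("C2", "H5"), ("C2", "H6"),
]


def element(label):
    """Strip the numeric suffix from an atom label, e.g. 'C1' -> 'C'."""
    return label.rstrip("0123456789")


def atoms(bonds):
    """Collect the distinct atom labels that appear in a bond list."""
    return {atom for bond in bonds for atom in bond}


def resultant_inputs(bonds):
    """Return the structure-independent inputs: nuclear charges and electron count.

    For a neutral molecule the number of electrons equals the total nuclear charge.
    """
    charges = Counter(ATOMIC_NUMBER[element(a)] for a in atoms(bonds))
    n_electrons = sum(z * count for z, count in charges.items())
    return charges, n_electrons


def heavy_atom_skeleton(bonds):
    """Return the bonding pattern among non-hydrogen atoms as element pairs."""
    return Counter(
        tuple(sorted((element(a), element(b))))
        for a, b in bonds
        if element(a) != "H" and element(b) != "H"
    )


# Same inputs to the resultant description, different structures.
same_inputs = resultant_inputs(ETHANOL) == resultant_inputs(DIMETHYL_ETHER)
same_structure = heavy_atom_skeleton(ETHANOL) == heavy_atom_skeleton(DIMETHYL_ETHER)
print(same_inputs, same_structure)
```

The bond lists play the role of the structural information that, on Hendry’s account, has to be supplied from chemistry when a configurational Hamiltonian is constructed; nothing in resultant_inputs has access to it.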

The second example that Hendry takes as empirical support for downward causation involves the symmetry properties of molecules. Similarly to the case of isomers, one cannot derive the different chemical symmetry properties from the relevant resultant Hamiltonian because ‘the only force appearing in molecular Schrödinger equations is the electrostatic or Coulomb force: other forces are negligible at the relevant scales. But the Coulomb force has spherical symmetry’ (Hendry 2017: 154).

As is the case with other accounts of strong emergence in philosophy of science, Hendry’s account of strong emergence overcomes the overdetermination problem by postulating that there are certain quantum mechanical effects which do not have purely quantum mechanical causes (Wilson 2015: 353). That is, accounts of strong emergence deny the causal completeness of the physical (CCP), which states that ‘every lower-level physically acceptable effect has a purely lower-level physically acceptable cause’ (Wilson 2015: 352). Instead of the CCP, Hendry proposes an alternative principle; namely the ‘ubiquity of physics’ (UP):

Under the ubiquity of physics, physical principles constrain the motions of particular systems though they may not fully determine them. (Hendry 2010b: 188)

This principle acts as a substitute for the causal completeness of the physical (CCP) which Hendry rejects and which is incompatible with his notion of strong emergence. UP allows for the physical principles (as these are formulated via the physical laws and theories) to ‘apply universally without accepting that they fully determine the motions of the systems they govern’ (Hendry 2010b: 188). According to Hendry, unlike UP, the CCP is not well supported by physics itself, and he follows Bishop in thinking that physics ‘does not imply its own causal closure’ (Bishop 2006: 45). Note that, given the rejection of the CCP, strong emergence, as understood by Hendry, is incompatible not only with some form of epistemic reduction but also with reductive and non-reductive physicalism.

A critique of Hendry’s account in the philosophy of chemistry literature is provided by Scerri, who argues that the putative empirical evidence invoked by the former for the support of strong emergence is merely a ‘theoretical rather than ontological issue’ (Scerri 2012a: 25).

c. Alternative Forms of Emergence

There are alternative accounts of emergence with respect to chemistry. These are mostly accounts which focus on the unique epistemological features of chemistry and propose an understanding of emergence that is primarily epistemic, rather than metaphysical. For example, Bishop and Atmanspacher (2006) formulate an account of ‘contextual emergence’ which they take to apply successfully in two separate cases: namely to the case of molecular structure and to that of temperature (see also Bishop 2010). With respect to molecular structure, they argue that quantum mechanics provides necessary but not sufficient conditions for the description of molecular structure. This implies that reduction is not the appropriate account to correctly specify the relation between the two relevant descriptions. In order to derive a lower-level (that is, quantum mechanical) description of molecular structure, one introduces sufficient conditions by specifying the particular context in which the relevant lower-level system is considered. This allegedly supports the claim that molecular structure is a novel property which is not derivable from the quantum mechanical description alone but rather emerges from it (Bishop and Atmanspacher 2006: 1774; see also Bishop 2010: 176-177; Llored 2012: 248).

Furthermore, Llored presents ‘a relational form of emergence which pays attention to the constitutive role of the modes of intervention and to the co-definition of the levels of organization’ (Llored 2012: 245). This is not a metaphysical account of emergence; as Llored states, his proposed account is ‘agnostic’ with respect to the ontology of chemistry and rather focuses on ‘what chemists do in their daily work’ (Llored 2012: 245). In particular, Llored looks at how ‘from the Twenties to nowadays, quantum chemical methods have been constitutively concerned with the links between the molecule and its parts’ (2012: 257) (italics are in the original text). Among other things, he presents and analyses the debate between Linus Pauling and Robert Mulliken who both ‘focused on the description and the understanding of the molecule, its reactivity, and thus its transformations’ (Llored 2012: 257). Llored argues that his proposed account of emergence is not one which advocates an asymmetric relation between higher and lower-level properties. Rather, both chemical and quantum mechanical properties ‘co-emerge’ (Llored 2014: 156). Chemical phenomena are understood ‘as relative to a certain experimental context, with no possibility of separating them from this context’ (Llored 2014: 156; see also Llored and Harré 2014).

5. Beyond Reduction and Emergence

Very few accounts consider the relation of chemistry to quantum mechanics without invoking some form of reduction or emergence. In fact, if we are to understand epistemic reduction and strong emergence as the two extremes of a spectrum of inter-theoretic accounts, then there is a variety of positions that have remained to this day relatively unexplored with respect to chemistry. Nevertheless, there are some philosophers who consider the possibility of understanding chemistry’s relation to quantum mechanics without reference to reduction or emergence. This section distinguishes between two main camps. First are those accounts which consider unity without reduction. Secondly, there are accounts which support the autonomy of chemistry without reference to some form of emergence.

a. Unity without Reduction

Two philosophers of chemistry have primarily examined chemistry’s relation to quantum mechanics in terms of unity without reduction. First, Needham examines unity without reduction by presenting Pierre Duhem’s ‘scheme’ of ‘unity without reduction’ (Needham 2010: 166). He states that

unity surely does not require reduction, intuitively understood as the incorporation of one theory within another. […] Consistency, requiring the absence of contradiction, and more generally in the sense of the absence of conflicts, tensions and barriers within scientific theory, would provide weaker, though apparently adequate, grounds for unity. (Needham 2010: 163)

According to Duhem’s scheme of unity, ‘(m)icroscopic principles complement macroscopic theory in an integrated whole, with no presumption of primacy of the one over the other’ (Needham 2010: 167). This implies that Duhem’s understanding of unity is incompatible with reductionism in the sense that it rejects that physics is the most fundamental science.

Moreover, Needham argues that positions on unity can be distinguished into four groups:

(i) Unity in virtue of reduction, with no autonomous areas,

(ii) unity in virtue of consistency and not reduction, but still no autonomy because of interconnections,

(iii) unity in virtue of consistency and not reduction, with autonomous areas, and

(iv) disunity. (Needham 2010: 163-164)

Hettema engages in the discussion of unity with respect to chemistry and evaluates Needham’s scheme of unity (2017). In particular, Hettema takes the first form of unity to assume a form of ‘reductionism in which derivation is strict and reduction postulates are identities’ (Hettema 2017: 277). Regarding the second form of unity, Hettema argues that it faces certain challenges. For example, in this form of unity ‘the nature of the “interconnections” is […] not well specified in Needham’s scheme’ (Hettema 2017: 277). Moreover, ‘the theories of chemistry and physics are not as strongly dependent on each other as implied (though not stated) in position (ii) in the scheme’ (Hettema 2017: 277-278). Hettema rejects the third form of unity because it allegedly disregards the ‘idea that one science may fruitfully explain aspects of another’ (Hettema 2017: 278).

As already mentioned, Hettema proposes a novel account of reduction regarding the relation between chemistry and quantum mechanics (see subsection 3d). In the broader context of unity, Hettema takes his account to propose a form of unity that Needham’s scheme does not capture. Specifically, Hettema’s account does not support ‘a form of unity in virtue of reduction with no autonomous areas’ (in line with (i)) because, unlike (i), it requires neither strict derivation nor the existence of identity relations between the reduced and reducing theories. Moreover, Hettema’s account does not advocate unity without reduction either. While he acknowledges that his account shares common features with non-reductive accounts of unity in the philosophy of science literature, he maintains that his account proposes a ‘naturalised Nagelian reduction’ (Hettema 2012b: 143).

Interestingly, there are two features that his account allegedly shares with certain non-reductive accounts of unity. First, Hettema takes his account of reduction to be compatible with an understanding of theories as ‘interfield theories’ which ‘use concepts and data from neighbouring fields’ (in line with Darden and Maull 1977) (Hettema 2012b: 160). In this context, absolute reaction rate theory is characterised as an interfield theory ‘where the theories comprising the interfield are in turn reductively connected’ (Hettema 2012b: 168). There is no one-to-one relation between the reduced and reducing theory; rather there is a ‘net of theories’ where ‘connective and derivative links of a Nagelian sort exist between all these theoretical approaches’ (Hettema 2012b: 168). As a result, the overall reduction of chemistry is specified in terms of a network of different theories that are reductively connected to one another (Hettema 2012b: 171). Secondly, Hettema takes his account to be compatible with Bokulich’s non-reductive account of ‘interstructuralism’, according to which two theories are related in virtue of the ‘structural continuities and correspondences’ between them (Bokulich 2008: 173; Hettema 2012b: 163). Indeed, Hettema identifies structural continuities in the case of the absolute reaction rate theory (Hettema 2012b: 171).

Lastly, Seifert (2017) advocates unity without reduction, arguing that chemistry and quantum mechanics are unified in a non-reductive manner because they exhibit particular epistemic and metaphysical inter-connections.

b. Pluralism

The autonomy of chemistry from quantum mechanics has been defended without reference to emergence in the form of pluralist accounts. (Accounts of pluralism that have not been explicitly investigated with respect to chemistry’s relation to quantum mechanics, such as Chang’s (2012), are not presented here.) For example, Lombardi and Labarca argue for a ‘Kantian-rooted ontological pluralism’ which is based on Putnam’s account of internalist realism (Lombardi 2014b: 23; see also Lombardi and Labarca 2005; Putnam 1981). They claim that while the epistemological reduction of chemistry is in general rejected in the philosophy of chemistry, the ontological reduction of chemistry is more or less accepted (Lombardi and Labarca 2005: 132-133). They take the acceptance of chemistry’s ontological reduction to imply an antirealist or eliminativist view of chemical ontology and to undermine philosophy of chemistry’s relevance when it comes to investigating metaphysical issues (Lombardi and Labarca 2005: 134). In this context, they argue that a hierarchical view of ontology, where everything is grounded on more fundamental physical entities, should be replaced by a view of the world in which ‘different but equally objective theory-dependent ontologies interconnected by nomological, non-reductive relationships’ coexist (Lombardi and Labarca 2005: 146; Lombardi 2014b).

There are various objections against this account of ontological pluralism (Needham 2006; Manafu 2013; Hettema 2014: 195-196; see also Lombardi and Labarca 2006). For example, Manafu argues that Lombardi and Labarca have insufficiently argued for the ‘equal’ reference of concepts that are postulated by different theories. This is because if a theory is reduced to, superseded by, or merely has different theoretical virtues from another theory, then it is not necessary that such a theory employs concepts that actually refer to things that exist (Manafu 2013: 227).

Schummer also argues in favour of a pluralist position. He claims that chemistry’s relation to physics should be understood in accordance with methodological pluralism (2014b). Chemistry and each of its sub-disciplines have distinct subject matters, pose different research questions and employ distinct methods and concepts. Even when it comes to concepts that are employed by both chemistry and physics, such as ‘molecule’ and ‘molecular structure’, Schummer argues that these concepts frequently have different meanings in each of the two disciplines and are employed in the context of radically distinct models, methods and research goals (Schummer 2014b: 260).

6. Conclusion

Given how chemistry’s relation to quantum mechanics has been investigated in the philosophy of chemistry so far, it is possible to draw the following conclusions. First, in the first decades of the 21st century, the philosophy of chemistry persistently argued that chemistry’s relation to quantum mechanics is not a reductive relation, as philosophers and physicists such as Nagel and Dirac commonly supposed. Another point drawn from this analysis is that one cannot correctly spell out the relation between the two sciences unless one takes into account the role of approximations, assumptions, models and idealisations in the two sciences.

Moreover, it is evident that more can be said about chemistry’s relation to quantum mechanics. There is substantial material from the philosophy of science which has not been considered with respect to chemistry and which could contribute to a richer and more accurate understanding of the relation between the two sciences. For example, given the alleged failure of Nagelian reduction, it would be interesting to examine whether a different understanding of epistemic reduction applies to the case of chemistry. Alternative accounts of epistemic reduction that take into account the unique models, idealisations, and practices that the special sciences employ would contribute to formulating a novel understanding of the relation of chemistry with quantum mechanics. Also, it is worth investigating whether chemistry and quantum mechanics are unified in a way that neither requires some form of epistemic or ontological reduction, nor collapses to a strongly emergent or pluralist worldview. Lastly, there are various understandings of pluralism which have not been applied to the case of chemistry and which could further support general accounts of pluralism in the sciences. All in all, more can be said about chemistry’s relation to quantum mechanics which can fruitfully contribute to one’s analysis of reduction, unity, pluralism and emergence.

7. References and Further Reading

  • Arriaga, J. A. J., S. Fortin, and O. Lombardi. 2019. ‘A new chapter in the problem of the reduction of chemistry to physics: The Quantum Theory of Atoms in Molecules’, Foundations of Chemistry, 21(1): 125-136
  • Bader, Richard. 1990. Atoms in Molecules: A Quantum Theory (Oxford: Oxford University Press)
  • Bader, R. F. W., and C. F. Matta. 2013. ‘Atoms in molecules as non-overlapping, bounded, space-filling open quantum systems’, Foundations of Chemistry, 15: 253-276
  • Bensaude-Vincent, Bernadette. 2008. Essais d’histoire et de philosophie de la chimie (Paris: Presses Universitaires de Paris Ouest)
  • Bishop, Robert C. 2006. ‘The Hidden Premise in the Causal Argument for Physicalism’, Analysis, 66: 44-52
  • Bishop, Robert C. 2010. ‘Whence chemistry?’, Studies in History and Philosophy of Modern Physics, 41: 171-177
  • Bishop, R. C., and H. Atmanspacher. 2006. ‘Contextual emergence in the description of properties’, Foundations of Physics, 36(12): 1753-1777
  • Bogaard, Paul A. 1978. ‘The Limitations of Physics as a Chemical Reducing Agent’, PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, 1978(2): 345-356
  • Bokulich, Alisa. 2008. Reexamining The Quantum-Classical Relation: beyond Reductionism And Pluralism (Cambridge: Cambridge University Press)
  • Broad, C. D. 1925. The Mind and Its Place in Nature (London: Routledge and Kegan Paul)
  • Causá, M., A. Savin, and B. Silvi. 2014. ‘Atoms and bonds in molecules and chemical explanations’, Foundations of Chemistry, 16(1): 3-26
  • Chang, Hasok. 2012. Is Water H2O? Evidence, Realism and Pluralism (Dordrecht: Springer)
  • Chang, Hasok. 2015. ‘Reductionism and the Relation Between Chemistry and Physics’, in Relocating the History of Science, ed. by T. Arabatzis et al., Boston Studies in the Philosophy and History of Science, Vol. 312 (Dordrecht: Springer) pp. 193-209
  • Chang, Hasok. 2017. ‘What History Tells Us about the Distinct Nature of Chemistry’, Ambix, 64(4): 360-374, DOI: 10.1080/00026980.2017.1412135
  • Darden, L., and N. Maull. 1977. ‘Interfield Theories’, Philosophy of Science, 44(1): 43-64
  • Dirac, Paul. 1929. ‘The quantum mechanics of many-electron systems’, Proceedings of the Royal Society of London, Series A, Containing Papers of a Mathematical and Physical Character, 123(792): 714-733
  • Dizadji-Bahmani, F., Frigg, R., and S. Hartmann. 2010. ‘Who’s Afraid of Nagelian Reduction?’, Erkenntnis, 73: 393-412
  • Fazekas, Peter. 2009. ‘Reconsidering the Role of Bridge Laws in Inter-Theoretical Reductions’, Erkenntnis, 71: 303-322
  • Gavroglu, K., and A. Simões. 2012. Neither Physics nor Chemistry. A History of Quantum Chemistry (Cambridge MA: MIT Press)
  • González, J. C. M., Fortin, S., and O. Lombardi. 2019. ‘Why molecular structure cannot be strictly reduced to quantum mechanics’, Foundations of Chemistry, 21(1): 31-45
  • Goodwin, William. 2013. ‘Quantum Chemistry and Organic Theory’, Philosophy of Science, 80(5): 1159-1169
  • Griffiths, David J. 2005. Introduction to Quantum Mechanics, 2nd edn (USA: Pearson Education International)
  • Hendry, Robin F. 1998. ‘Models and Approximations in Quantum Chemistry’, Poznan Studies in the Philosophy of Science and the Humanities, 63: 123-142
  • Hendry, Robin F. 1999. ‘Molecular Models and the Question of Physicalism’, HYLE, 5: 117-34
  • Hendry, Robin F. 2006. ‘Is there Downwards Causation in Chemistry?’, in Philosophy Of Chemistry: Synthesis of a New Discipline, ed. by Davis Baird, Eric Scerri and Lee McIntyre, Boston Studies in the Philosophy of Science, Vol. 242 (Dordrecht: Springer) pp. 173-189
  • Hendry, Robin F. 2008. ‘Two Conceptions of the Chemical Bond’, Philosophy of Science, 75(5): 909-20
  • Hendry, Robin F. 2010a. ‘Emergence vs. Reduction in Chemistry’, in Emergence in Mind, ed. by Cynthia Macdonald and Graham Macdonald (Oxford: Oxford University Press) pp. 205-221
  • Hendry, Robin F. 2010b. ‘Ontological reduction and molecular structure’, Studies in History and Philosophy of Modern Physics, 41: 183–91
  • Hendry, Robin F. 2012. ‘Reduction, emergence and physicalism’, in Philosophy Of Chemistry, ed. by Andrea Woody, Robin F. Hendry and Paul Needham (Amsterdam: Elsevier) pp. 367–386
  • Hendry, Robin F. 2017. ‘Prospects for Strong Emergence in Chemistry’, in Philosophical and Scientific Perspectives on Downward Causation, ed. by Michele P. Paoletti, and Francesco Orilia (New York: Routledge) pp. 146-63
  • Hendry, R. F., and P. Needham. 2007. ‘Le Poidevin on the Reduction of Chemistry’, The British Journal for the Philosophy of Science, 58(2): 339–53
  • Hettema, Hinne. 2012a. Reducing chemistry to Physics: Limits, Models, Consequences (North Charleston SC: Createspace)
  • Hettema, Hinne. 2012b. ‘The Unity of Chemistry and Physics: Absolute Reaction Rate Theory’, HYLE- International Journal for Philosophy of Chemistry, 18(2): 145-173
  • Hettema, Hinne. 2013. ‘Austere quantum mechanics as a reductive basis for chemistry’, Foundations of Chemistry, 14: 311-326
  • Hettema, Hinne. 2014. ‘Linking chemistry with physics: a reply to Lombardi’, Foundations of Chemistry, 16: 193-200
  • Hettema, Hinne. 2017. The Union of Chemistry and Physics- Linkages, Reduction, Theory Nets and Ontology (Springer International Publishing)
  • Hofmann, James R. 1990. ‘How the Models of Chemistry Vie’, PSA 1990, 1: 405-419
  • IUPAC. 2014. Compendium of Chemical Terminology: Gold Book, Version 2.3.3, Available at: http://goldbook.iupac.org/pdf/goldbook.pdf [accessed 3/05/2018]
  • Klein, Colin. 2009. ‘Reduction Without Reductionism: A Defence of Nagel on Connectability’, The Philosophical Quarterly, 59(234): 39-53
  • Le Poidevin, Robin. 2005. ‘Missing Elements and Missing Premises: A Combinatorial Argument for the Ontological Reduction of Chemistry’, The British Journal for the Philosophy of Science, 56: 117-134
  • Llored, Jean-Pierre. 2012. ‘Emergence and quantum chemistry’, Foundations of Chemistry, 14(1): 245–274
  • Llored, Jean-Pierre. 2014. ‘Whole-Parts Strategies in Quantum Chemistry: Some Philosophical Mereological Lessons’, HYLE, 20(1): 141-163
  • Llored, J. P., and R. Harré. 2014. ‘Developing the mereology of chemistry’, in Mereology and the Sciences, ed. by C. Calosi and P. Graziani (London: Springer) pp. 189-212
  • Lombardi, Olimpia. 2014a. ‘Linking chemistry with physics: arguments and counterarguments’, Foundations of Chemistry, 16(3): 181-192
  • Lombardi, Olimpia. 2014b. ‘The Ontological Autonomy of the Chemical World: Facing the Criticisms’, in Philosophy of Chemistry. Boston Studies in the Philosophy and History of Science, vol. 306, ed. by E. Scerri and L. McIntyre (Dordrecht: Springer)
  • Lombardi, O., and M. Labarca. 2005. ‘The Ontological Autonomy of the Chemical World’, Foundations of Chemistry, 7(2): 125-148
  • Lombardi, O., and M. Labarca. 2006. ‘The Ontological Autonomy of the Chemical World: A response to Needham’, Foundations of Chemistry, 8: 81-92
  • Lombardi, Olimpia. 2007. ‘The Philosophy of Chemistry as a New Resource for Chemical Education’, Journal of Chemical Education, 84(1): 187-192
  • Manafu, Alexandru. 2013. ‘Internal realism and the problem of ontological autonomy: a critical note on Lombardi and Labarca’, Foundations of Chemistry, 15: 225-228
  • Matta, Chérif F. 2013. ‘Special issue: Philosophical aspects and implications of the quantum theory of atoms in molecules (QTAIM)’, Foundations of Chemistry, 15(3): 245-251
  • Matta, C. F., and R. J. Boyd (eds). 2007. The Quantum Theory of Atoms in Molecules: From Solid State to DNA and Drug Design (Weinheim: Wiley-VCH Verlag GmbH & Co. KGaA)
  • McLaughlin, Brian. 1992. ‘The Rise and Fall of British Emergentism’, in Emergence or Reduction? Essays on the Prospect of a Non-Reductive Physicalism, ed. by A. Beckermann, H. Flohr and J. Kim (Berlin: Walter de Gruyter) pp. 49-93
  • Mill, John S. 1930. A System of Logic Ratiocinative and Inductive (London: Longmans, Green and Co.)
  • Nagel, Ernest. 1979. The Structure of Science: Problems in the Logic of Scientific Explanation, 3rd edn (Hackett Publishing)
  • Needham, Paul. 1999. ‘Reduction and abduction in chemistry- a response to Scerri’, International Studies in the Philosophy of Science, 13(2): 169-184
  • Needham, Paul. 2006. ‘Ontological Reduction: a comment on Lombardi and Labarca’, Foundations of Chemistry, 8(1): 73-80
  • Needham, Paul. 2009. ‘Reduction and Emergence: A Critique of Kim’, Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition, 146(1): 93-116
  • Needham, Paul. 2010. ‘Nagel’s analysis of reduction: Comments in defence as well as critique’, Studies in History and Philosophy of Modern Physics, 41: 163-170
  • Oppenheim, P., and H. Putnam. 1958. ‘The unity of science as a working hypothesis’, in Minnesota Studies in the Philosophy of Science vol. 2, ed. by H. Feigl et al. (Minneapolis: Minnesota University Press) pp. 3-36
  • Palgrave Macmillan Ltd. 2004. Dictionary of Physics (UK: Palgrave Macmillan)
  • Primas, Hans. 1983. Chemistry, Quantum Mechanics and Reductionism, 2nd edn (Berlin: Springer)
  • Putnam, Hilary. 1965. ‘How Not to Talk About Meaning’, in Boston Studies in Philosophy of Science, vol. II, ed. by R. S. Cohen and M. Wartofsky (New York: Humanities Press) pp. 206-207
  • Putnam, Hilary. 1981. Reason, Truth and History (Cambridge: Cambridge University Press)
  • Ramsey, Jeffry L. 1997. ‘Molecular Shape, Reduction, Explanation and Approximate Concepts’, Synthèse, 111: 233–251
  • Scerri, Eric. 1991. ‘The Electronic Configuration Model, Quantum Mechanics and Reduction’, British Journal for the Philosophy of Science, 42: 309-325
  • Scerri, Eric. 1994. ‘Has Chemistry Been at Least Approximately Reduced to Quantum Mechanics?’, PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, 1994: 160-170
  • Scerri, Eric. 1998. ‘Popper’s naturalised approach to the reduction of chemistry’, International Studies in the Philosophy of Science, 12(1): 33-44, DOI: 10.1080/02698599808573581
  • Scerri, Eric. 2006. ‘Normative and Descriptive Philosophy of Science and the Role of Chemistry’, in Philosophy of Chemistry: Synthesis of a New Discipline, ed. by Davis Baird, Eric Scerri and Lee McIntyre, Boston Studies in the Philosophy of Science, Vol. 242 (Dordrecht: Springer) pp. 119-128
  • Scerri, Eric. 2007a. ‘Reduction and Emergence in Chemistry- Two Recent Approaches’, Philosophy of Science, 74(5): 920-31
  • Scerri, Eric. 2007b. ‘The Ambiguity of Reduction’, HYLE, 13(2): 67-81
  • Scerri, Eric. 2012a. ‘Top-down causation regarding the chemistry-physics interface: a sceptical view’, Interface Focus, 2: 20-25
  • Scerri, Eric. 2012b. ‘What is an element? What is the periodic table? And what does quantum mechanics contribute to the question?’, Foundations of Chemistry, 14: 69-81
  • Scerri, E., and G. Fisher (eds). 2015. Essays in the Philosophy of Chemistry (Oxford: Oxford University Press)
  • Schummer, Joachim. 1998. ‘The Chemical Core of Chemistry I: A Conceptual Approach’, HYLE- International Journal for Philosophy of Chemistry, 4: 129-162
  • Schummer, Joachim. 2014a. ‘Editorial: Special Issue on ‘General Lessons from Philosophy of Chemistry’ on the occasion of the 20th Anniversary of HYLE’, HYLE- International Journal for Philosophy of Chemistry, 20: 1-10
  • Schummer, Joachim. 2014b. ‘The Methodological Pluralism of Chemistry and Its Philosophical Implications’, in Philosophy of Chemistry: Growth of a New Discipline, ed. by Eric Scerri and Lee McIntyre (Dordrecht: Springer) pp. 57-72
  • Seifert, Vanessa A. 2017. ‘An alternative approach to unifying chemistry with quantum mechanics’, Foundations of Chemistry, 19(3): 209-222
  • Sutcliffe, B. T., and R. G. Woolley. 2012. ‘Atoms and Molecules in Classical Chemistry and Quantum Mechanics’, in Handbook of the Philosophy of Science. Volume 6: Philosophy of Chemistry, ed. by Robin F. Hendry, Paul Needham and Andrea I. Woody (Amsterdam: Elsevier BV) pp. 387-426
  • Tapia, O. 2006. ‘Can Chemistry Be Derived from Quantum Mechanics? Chemical Dynamics and Structure’, Journal of Mathematical Chemistry, 39(3/4): 637-639
  • van Brakel, Jaap. 1999. ‘On the Neglect of the Philosophy of Chemistry’, Foundations of Chemistry, 1(2): 111-174
  • van Brakel, Jaap. 2000. Philosophy of Chemistry. Between the Manifest and the Scientific Image (Leuven: Leuven University Press)
  • van Brakel, Jaap. 2014. ‘Philosophy of Science and Philosophy of Chemistry’, HYLE- International Journal for Philosophy of Chemistry, 20: 11-57
  • van Riel, Raphael. 2011. ‘Nagelian Reduction beyond the Nagel Model’, Philosophy of Science, 78(3): 353-375
  • Weininger, Stephen J. 1984. ‘The Molecular Structure Conundrum: Can Classical Chemistry Be Reduced to Quantum Chemistry?’, Journal of Chemical Education, 61: 939-944
  • Weisberg, Michael. 2008. ‘Challenges to the Structural Conceptions of Chemical Bonding’, Philosophy of Science, 75: 932–46
  • Wilson, Jessica. 2015. ‘Metaphysical Emergence: Weak and Strong’, in Metaphysics in Contemporary Physics, ed. by Tomasz Bigaj and Christian Wüthrich (Poznan Studies in the Philosophy of the Sciences and the Humanities) pp. 345-402
  • Woolley, R. G. 1976. ‘Quantum theory and molecular structure’, Advances in Physics, 25(1): 27-52
  • Woolley, R. G. 1978. ‘Must a Molecule Have a Shape?’, Journal of the American Chemical Society, 100(4): 1073-1078
  • Woolley, R. G. 1985. ‘The Molecular Structure Conundrum’, Journal of Chemical Education, 62: 1082-1085
  • Woolley, R. G. 1991. ‘Quantum Chemistry Beyond the Born-Oppenheimer Approximation’, Journal of Molecular Structure (Theochem), 230: 17-46
  • Woolley, R. G. 1998. ‘Is there a quantum definition of a molecule?’, Journal of Mathematical Chemistry, 23: 3–12
  • Woolley, R. G., and B. T. Sutcliffe. 1977. ‘Molecular Structure and the Born-Oppenheimer Approximation’, Chemical Physics Letters, 45(2): 393-398
  • Woody, Andrea I. 2000. ‘Putting Quantum Mechanics to Work in Chemistry: The Power of Diagrammatic Representation’, Philosophy of Science, 67: S612-S627

Author Information

Vanessa A. Seifert
Email: vs14902@bristol.ac.uk
University of Bristol
United Kingdom

The Axiology of Theism

The existential question about God asks whether God exists, but the axiology of theism addresses the question of what value-impact, if any, God’s existence does (or would) have on our world and its inhabitants. There are two prominent answers to the axiological question about God. Pro-theism is the view that God’s existence does (or would) add value to our world. Anti-theism, by contrast, is the view that God’s existence does (or would) detract from the value of our world. Philosophers have observed that the answer to the axiological question may vary depending on its target and scope. For instance, assessments of God’s value-impact could be made from an impersonal perspective without reference to individuals, or from a personal perspective with reference to the value-impact of God only for a particular person or persons. Axiological assessments can also take into account one, some, or all of the purported advantages and downsides of God’s existence.

No general consensus has emerged in the literature regarding the correct answer(s) to the axiological question about God. Some philosophers argue that the answer to the question is obvious, or that the very question itself is unintelligible. For instance, it might be unintelligible to the many theists who hold that if God does not exist then nothing else would exist. On that view, it is impossible to compare a world with God to a world without God. The most promising argument in support of anti-theism in the literature is the Meaningful Life Argument, which suggests that God’s existence would make certain individuals’ lives worse, for those individuals have life plans so intimately connected with God’s non-existence that, if it turned out that God exists, their lives would lose meaning. The most promising argument for pro-theism is best understood as a cluster of arguments pointing to many of the purported advantages of God’s existence, including divine intervention (that is, God performing miracles that help people) and the impossibility of gratuitous evil on theism. Additionally, some pro-theists claim that since God is infinitely good, any state of affairs with God is also infinitely good. To date, the literature has focused on comparing the axiological value of theism (especially Christianity) to atheism (especially naturalism). Future work will likely include axiological assessments of other religious and non-religious worldviews.

Table of Contents

  1. The Axiological Question about God
  2. Is the Axiological Question Intelligible?
  3. Different Answers to the Axiological Question
  4. Arguments for Pro-Theism
    1. The Infinite Value Argument
    2. The Morally Good Agents Argument
    3. The Goods of Theism Argument
  5. Arguments for Anti-Theism
    1. The Meaningful Life Argument
    2. The Goods of Atheism Argument
  6. Connections to the Existence of God
    1. Divine Hiddenness
    2. Problem of Evil
    3. Anti-Theism entails Atheism
  7. Future Directions
    1. Exploration of Different Answers
    2. Other Worldviews
  8. References and Further Reading

1. The Axiological Question about God

A perennial topic in the philosophy of religion is the existential question of whether God exists. Arguments in support of theism include the ontological, cosmological, teleological, and moral arguments. Arguments in support of atheism, on the other hand, include the arguments from evil, from no best world, and from divine hiddenness. Many of these arguments and topics have a rich philosophical history and sophisticated versions of them continue to be discussed in the literature. The importance of the existential question is obvious: God’s existence is tied to the truth value of the theistic religions. It is of little surprise, then, that philosophers of religion have spilled so much ink over these topics.

This article does not discuss the existential question of whether God exists. Rather, it examines the axiological question about the value-impact of God’s existence. Some brief remarks by Thomas Nagel are often credited as the starting point in the literature (Kahane 2011, 679; Kraay and Dragos 2013, 159; Penner 2015, 327). In his book The Last Word, Nagel quips: “I hope there is no God! I don’t want there to be a God; I don’t want the universe to be like that” (1997, 130). Nagel is an atheist who thinks he is rational in his atheism. He thinks that in light of the evidence, atheism is the correct answer to the existential question about God. Yet here he expresses a desire or preference for the non-existence of God. Reflections on this brief quote from Nagel have led to the emergence of discussion about the axiological question in the philosophy of religion. While it is clear Nagel is expressing a preference, philosophers initially wanted to know whether it could be developed into an axiological position.

One interesting aspect of this question is that it seems to be conceptually distinct from the existential question about God. For instance, it seems perfectly consistent for an atheist who denies that God exists to simultaneously believe that God’s existence would be good, though some have denied this claim (for example, Schellenberg 2018). It also seems consistent for a theist who is convinced that God exists to hold that there are negative consequences of God’s existence. Finally, it’s worth pointing out that the axiological question has come to be understood as a comparative question about the difference in value between different possible worlds or states of affairs (that is, between God worlds and God-less worlds).

2. Is the Axiological Question Intelligible?

In explaining what the axiological question is asking, Guy Kahane writes in an early and influential piece that

We are not asking theists to conceive of God’s death—to imagine that God stopped existing. And given that theists believe that God created the universe, when we ask them to consider His inexistence we are not asking them to conceive an empty void […] I will understand the comparison to involve the actual world [where God exists] and the closest possible world where [God does not exist] (Kahane 2011, 676).

While this makes clear the relevant comparison that Kahane and others have in view, some have suggested that the axiological question itself is unintelligible (Kahane 2012, 35-37; Mugg 2016). This is based on the fact that on a standard (Lewis/Stalnaker) semantics, counterpossibles (that is, counterfactual conditionals with impossible antecedents) are trivially true. God is typically understood as a necessary being. This means that if God exists, then God exists in every possible world (that is, in every possible state of affairs). Given this, if theism is true, the statement ‘God does not exist’ is necessarily false, and any conditional with it as antecedent is a counterpossible. Now, consider the following conditional: If God does not exist, then the world would be better (or worse). Given theism, any conditional with this antecedent is trivially true because there is no way that the antecedent could be true while the consequent is false. This is because there is no way for the antecedent to be true on theism. If this worry is correct, then cross-world axiological judgements are uninformative at best, and possibly unintelligible or impossible at worst. Notice that the same applies to atheism if the view in mind has it that there is no possible world in which God exists (that is, necessitarian atheism, the view that God necessarily does not exist).

One approach to this objection suggests that this type of axiological comparison is possible as a result of a process called cognitive decoupling. This occurs when an agent extracts information from a representation and then performs computations on it in isolation. Certain information is ‘screened off’ and thus not used in the reasoning process. Likewise, “[t]hose beliefs that are allowed into the reasoning process, along with suppositions, are ‘cognitively quarantined’ from the subject’s beliefs” (Mugg 2016, 448). Consider:

Bugs Bunny might pick up a hole off the ground and throw it on a wall. It is not metaphysically possible to pick up a hole, but we are able to suppose that Bugs has picked up the hole and recognize that Bugs can now jump through the wall. Thus, we can imagine an impossible state of affairs and make judgments about what would obtain within that state of affairs. In representing the impossible state of affairs, we screen out those beliefs that would lead to outright contradiction (Mugg 2016, 449).

In this context, cognitive decoupling occurs in situations in which, “when considering a counterfactual, subjects can screen out those beliefs that (with the antecedent of the counterfactual) imply contradictions” (Mugg 2016, 449). A theist who holds that God necessarily exists could address the axiological question by engaging in cognitive decoupling. This means that when addressing the axiological question, she ‘screens off’ her belief that God necessarily exists (and conversely, a necessitarian atheist could screen off her belief that God necessarily doesn’t exist). This proposal raises a number of questions, including how we can be confident that we have ‘screened off’ the appropriate beliefs, and also whether the comparison made when engaging in cognitive decoupling is relevantly similar to the real-world comparison needed to answer the axiological question.

Another proposal for dealing with this objection suggests that this worry about counterpossibles arises only when the comparison in question is understood as one between metaphysically possible worlds. But, so the proposal goes, when the relevant comparison is one between epistemically possible worlds, the counterpossible problem doesn’t apply (Mawson 2012; see also Chalmers 2011). After all, the theist who believes that God exists of metaphysical necessity holds that there are no metaphysically possible worlds where God doesn’t exist. But for a state of affairs to be epistemically possible for such a theist, she only needs to concede that it could obtain, for all she knows. Thus, the theist just needs to concede that, for all she knows, God may not exist. A helpful analogy comes by way of reflecting on the idea that water is H2O. While there are no metaphysically possible worlds where water is not H2O, for all one knows, water is not H2O. Hence, there are epistemically possible worlds where water is not H2O (Chalmers 2011, 60-62). For all the necessitarian theist knows, atheism is true, while for all the necessitarian atheist knows, theism is true. Thus, regardless of whether the comparison between metaphysically possible worlds is intelligible, the comparison between epistemically possible worlds is perfectly intelligible.

Yet another reply to the counterpossible problem holds that value can intelligibly be assigned to metaphysical impossibilities (Kahane 2012, 36-37). For if it is possible to assign a value to a metaphysical impossibility, then perhaps the theist who thinks that atheism is metaphysically impossible could still assign a value to the relevant counterpossibles. Consider, for instance, that a mathematical proof could rightly be called beautiful or elegant even if it turns out to be invalid. Of course, it’s controversial whether it’s appropriate to talk of the beauty of an invalid proof. If such judgments turn out not to be appropriate, then many of our value assignments will be merely apparent, not factual (Kahane 2012, 37). We will think we are making a factual value judgment when in fact we are not.

To conclude this section, it’s worth noting that the literature on the axiology of theism often treats rational preference as supervening on axiological judgments (that are understood to be objective). But it is an open question whether an agent’s rational preference need always correspond to correct axiological judgments. Perhaps it could be rational for an agent to prefer a worse state of affairs to a better one, or to disprefer a better state of affairs to a worse one. Kahane (2011) appears to think this is a genuine possibility. I won’t dwell on this issue, but it’s worth keeping in mind as one explores this topic. We’re now in a position to examine different answers that can be proposed to the axiological question.

3. Different Answers to the Axiological Question

While some have attempted to address worries about the intelligibility of the axiological question, many philosophers have simply proceeded directly to attempting to answer the question (presumably because they are either unaware of the problem or implicitly assume that it has a reasonable solution). No consensus as to the correct answer to the axiological question has emerged in the literature (and one seems unlikely to emerge anytime soon). What has become clear, though, is that there are a great number of different possible answers one could offer to the axiological question.

The two main general positions that have been taken up in the literature are pro-theism and anti-theism. Pro-theism is, roughly, the view that it would be good if God were to exist. Anti-theism, on the other hand, holds that it would be bad if God were to exist. There are, however, other potential answers which haven’t received as much attention. For instance, the neutralist about the axiological question holds that God’s existence has (or would have) a neutral impact on the value of the world. The quietist holds that the axiological question cannot (in principle) be answered. Finally, the agnostic holds that the axiological question might be answerable, but we are currently unable to answer it. Much more remains to be said about the plausibility of these three latter positions. (For more on these answers see Kraay 2018, 10-18.)

There are numerous specific variants of these answers to the question. There is a difference between personal and impersonal judgements about the axiological question. The former focus on the axiological implications of God’s existence with respect to individual persons, while the latter focus on such implications without any reference to God’s value-impact on persons. Additionally, there are narrow and broad (or wide) judgements about the axiological question. The former refer to just one advantage (or downside) of God’s existence (or non-existence), while the latter refer to the axiological consequences of God’s existence or non-existence overall. These two distinctions (personal/impersonal and narrow/broad), together with the three existential positions one might hold (theism, atheism, and agnosticism), combine to form at minimum sixty possible answers to the axiological question when applied to the five general answers stated above (5 × 2 × 2 × 3 = 60). Klaas J. Kraay’s (2018, 9) helpful chart enables us to visualize all of these different possibilities:

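Axiological Positions

General answers (columns): Pro-Theism, Anti-Theism, Neutralism, Agnosticism, Quietism
Sub-divisions of each general answer (shown for Pro-Theism): Impersonal (Narrow, Wide); Personal (Narrow, Wide)
Existential positions (rows): Theism, Atheism, Agnosticism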

The first column contains all of the sub-divisions relevant to pro-theism. The other general answers can be subdivided in precisely the same way. Likewise, inasmuch as there are additional general answers to the axiological question beyond the five offered here, this chart will increase in size. These distinctions are important for a number of different reasons. For instance, later we will see that some have claimed that defending wide personal/impersonal anti-theism is a very difficult, if not impossible, task. Another interesting idea that has emerged in the literature thus far is that someone can be a narrow personal anti-theist and a wide personal/impersonal pro-theist (Lougheed 2018c). In other words, someone could hold that it would be a bad thing for her, in certain respects, if God exists, while acknowledging that it would be a good thing overall if God exists.

4. Arguments for Pro-Theism

This section outlines three different considerations that speak in favour of pro-theism.

a. The Infinite Value Argument

One argument for pro-theism appeals to the idea that God is infinitely valuable (for discussion see Van Der Veen and Horsten 2013). The thought is that if God is infinitely valuable, then any world with God is infinitely valuable, because God confers infinite value on every world in which God exists. From this it follows that any theistic world is more valuable than an atheistic world (or at least not worse if atheistic worlds can be infinitely valuable). There are at least two areas in need of further development regarding this line of argument. First, more work has to be done to show how God’s infinite value can sensibly be thought to make a world (assuming theism is true) infinitely valuable. There is a vast literature on the divine attributes, but the idea of God’s infinite value has been neglected (at least in the contemporary literature). What is it to say that God is infinite? How is an abstract concept, infinity, supposed to accurately describe God’s value? Second, the claim that all theistic worlds have the same infinitely high value appears to violate very basic modal and moral intuitions. Consider two worlds in which God exists, one of which includes a genocide that the other does not. These two worlds are otherwise identical. Surely the world without the genocide is, all else being equal, axiologically superior to the world that includes it.

b. The Morally Good Agents Argument

The Morally Good Agents Argument is another argument in favour of pro-theism. Here is a thought experiment motivating this argument. Imagine that Carl’s car breaks down on the highway. Carl has no phone to call for help, and he doesn’t know anything about car mechanics. First, consider a case in which Susan, a morally good agent, discovers Carl on the side of the highway and offers help. She calls a tow truck for Carl, and when she discovers Carl doesn’t have his wallet, she pays for the tow herself. Second, consider a case in which no one pulls over to assist Carl. He attempts to flag down cars, but no one stops. Although Carl is in poor health, he has no choice but to attempt to walk to the nearest gas station for assistance. These two cases are designed to show that morally good agents tend to add value to states of affairs. If the point generalizes, then a world with morally good agents is better than one without such agents, all else being equal (Penner and Lougheed 2015, 56).

Now consider two additional scenarios. Imagine that George sees Carl attempting to flag down vehicles. George attempts to pull over in order to assist Carl, but his brakes fail and he crashes into Carl, killing him on impact. Or consider Tom, who sees a truck crash into Carl’s car and then drive away. Carl’s car is now on fire with Carl trapped inside. Tom calls 911 but knows that the paramedics won’t arrive in time to save Carl. Tom tries to open the door to save Carl, but he isn’t strong enough to pry the bent door open. The idea behind these two additional cases is to acknowledge that morally good agents, despite good intentions, don’t necessarily have the power to do good. Of course, this doesn’t apply to God. Since God is all-powerful, God won’t be constrained or unable to add value to states of affairs in ways that other morally good agents might be constrained. Inasmuch as it makes sense to think that morally good agents add value to states of affairs, then God adds value to states of affairs. All else being equal, then, a world with God is better than a world without God (Penner and Lougheed 2015, 57-58).

There are a number of objections to this line of argument which attempt to show that not all else is equal. One reason to think God’s existence isn’t valuable (at least for certain individuals) is based on the idea that God violates everyone’s privacy. If God exists, then there is a sense in which God automatically violates our privacy (that is, if God is all-knowing, then God knows all of our mental states/thoughts). Without a justifying reason to violate a person’s privacy, this is an aspect in which God’s existence is a bad thing, for part of what’s involved in people forming trusting relationships with each other is that they choose what information about themselves they reveal. But this type of choice is impossible for individuals to make in the case of God. (The issue of privacy will be discussed further in section 5a below.) The question remains, however, whether this worry, assuming it really is a downside, is enough to outweigh all of the goods associated with theism. Another objection invokes a worry about an inverted moral spectrum. Suppose that what we think is good is actually bad according to God, and vice versa. If this is right, then, while it might still be technically true that God is a morally good agent (and adds value), it would make little sense to think we ought to prefer that God exist (Penner and Lougheed 2015, 68).

c. The Goods of Theism Argument

The Goods of Theism Argument represents a family of arguments (some quite informally expressed) that focus on highlighting specific goods of theism. This style of argument need not deny that there are genuine goods associated with atheism. Rather, the goods identified in connection to theism are taken to outweigh any goods associated with atheism. Also, some might acknowledge that these goods need not make it rational for certain individuals, in certain respects, to prefer theism. But, so the thought goes, these goods do show that theism is better than atheism overall.

Various theistic goods that have been identified in the literature include objective meaning or purpose, an afterlife, and cosmic justice. For perhaps only God can be the source of objective meaning, and without God every human life would ultimately be meaningless (Cottingham 2005, 37-57; Metz 2019, 9-21). In addition, theism is often associated with the existence of an afterlife, which is connected to the idea that God’s existence ensures that there will be final justice. Many who are wronged on earth are not compensated for being wronged. Those who perpetrate evil often seem to go unpunished. However, God’s existence is good because God will ensure that everyone will receive their due. This could be a logical consequence of the existence of a perfect being. The pro-theist need not be committed to the specific details of how this good is instantiated (Lougheed 2018a).

Perhaps one of the most important putative advantages of theism is that if God exists, there are no instances of gratuitous evil. For many theists hold that the existence of gratuitous evil is logically impossible if God exists (Kraay and Dragos 2013, 166; McBrayer 2010). This is because God would ensure that evil only occurs to achieve some otherwise unobtainable good or that every victim of evil will receive just compensation. Notice that there is no pressure on the pro-theist to explain how certain apparent instances of gratuitous evil are not in fact gratuitous (though this is a problem when defending the existence of God). For the pro-theist is merely claiming that if God exists, then there is no gratuitous evil. She isn’t claiming that in fact there is no gratuitous evil. That there is no gratuitous evil if God exists appears to be a very strong consideration in favour of pro-theism.

One worry for this general line of argument is about whether the goods mentioned here are goods that only obtain on theism. If it could be shown that these goods obtain on atheism (or other religious and non-religious worldviews) then they would be of little help in demonstrating that a world with God is more valuable than one without God (Kahane 2018). A more pressing worry, however, is not whether these goods also obtain on naturalism, but whether theism is exclusively what’s required for them to obtain. Perhaps a very good, very powerful, very knowledgeable being who is only slightly lesser than God could ensure that all the goods in question obtain. If this is right, then theism isn’t required for these goods to obtain. For even if such a being existed, atheism would technically be true since God does not exist in this scenario. This is one area where it becomes problematic for the axiology of theism literature to use ‘naturalism’ and ‘atheism’ interchangeably.

5. Arguments for Anti-Theism

This section examines two important arguments for anti-theism.

a. The Meaningful Life Argument

Perhaps the most widely discussed argument for anti-theism is an argument which has come to be known as the Meaningful Life Argument. Guy Kahane is responsible for first gesturing at this argument, and his discussion is what sparked much recent interest in the axiological question about God. Kahane takes his cue from well-known objections to utilitarianism raised by Bernard Williams. Williams argues that utilitarianism is so demanding that it requires individuals to sacrifice things which give them meaning (1981, 14). The problem, then, is that utilitarianism is so demanding that, to follow it, one’s own life would cease to have meaning (or at least one would have to stop pursuing those things which confer meaning on one’s life). According to Kahane, this worry about utilitarianism has a parallel in the present context: he claims that theism might be too demanding in the way that utilitarianism is too demanding. It could require that certain individuals give up things which confer meaning on their lives. Kahane writes:

If a striving for independence, understanding, privacy and solitude is so inextricably woven into my identity that its curtailment by God’s existence would not merely make my life worse but rob it of meaning, then perhaps I can reasonably prefer that God not exist—reasonably treat God’s existence as undesirable without having to think of it as impersonally bad or as merely setting back too many of my interests. The thought is that in a world where complete privacy is impossible, where one is subordinated to a superior being, certain kinds of life plans, aspirations, and projects cannot make sense… Theists sometime claim that if God does not exist, life has no meaning. I am now suggesting that if God does exist, the life of at least some would lose its meaning (Kahane 2011, 691-692).

This is the first statement of the Meaningful Life Argument. Note that these thoughts only defend narrow personal anti-theism: according to this argument, it would be worse, in certain respects and for certain individuals, if it turns out that God exists.

The merits of this argument have been debated. For instance, it has been objected that we are often mistaken about what constitutes a meaningful life (Penner 2015, 335). Consider that we often pursue some end thinking it will fulfill us. But when we achieve that end, we often find we are no more fulfilled than we were before. In other words, we often end up thinking we’ve pursued the wrong end. Since we’re highly fallible with respect to what goods contribute to a meaningful life, we should not be confident in using such judgements to support personal anti-theism. Others have countered that for this objection to succeed, one would have to deny that the goods Kahane mentions such as independence, understanding, privacy, and solitude could contribute to an individual’s meaningful life (Lougheed 2017). But most of us don’t want to deny that these are goods. Still, it seems likely that there are quantitative and qualitative differences between how these goods are instantiated on theism compared to atheism. It remains to be seen whether such differences can be articulated in a way that successfully answers the objection, and hence supports personal anti-theism.

Additionally, it has been observed from the very beginning of the debate over the Meaningful Life Argument that for a good like privacy to be successfully harnessed in support of anti-theism, it needs to be shown that it is intrinsically valuable, but little has been said in this regard (Kahane 2011, 684). Something is intrinsically valuable if it is valuable in and of itself, and extrinsically valuable if it is valuable only for what we can get from it. If God exists, then God always knows where we are, what we are doing, and what we are thinking. But if privacy is only extrinsically valuable, it might turn out not to matter that God violates our privacy in this way. This issue lies at the very heart of whether personal forms of anti-theism can be defended. For if the anti-theist and pro-theist both agree that privacy is intrinsically valuable, then in order to defend personal anti-theism, it need only be shown that God violates our privacy (as opposed to also explaining why it matters if our privacy is violated). Thus, providing a case for why goods associated with atheism such as privacy are intrinsically valuable would greatly strengthen the case for narrow personal anti-theism.

Finally, a closely related but less developed argument for anti-theism appeals to considerations about dignity to defend personal anti-theism (Kahane 2011, 688-689). Imagine that parents decide to have a child merely in order for the child to become an accomplished musician, or professional athlete, or simply for more help on the farm. The idea here is that a child should have the freedom to choose their own life path. A parent should support a child in doing this inasmuch as possible (and inasmuch as the life path in question is morally permissible). To have a child in order to fulfill some end other than the child’s own fundamentally violates the dignity of the child. It treats the child merely as a means rather than as an end (Lougheed 2017, 350-351). The parallel case, of course, is supposed to be with respect to God’s relationship with humans. Many theistic traditions hold that humans were created solely to fulfill God’s purposes for them. If this is true, then humans aren’t permitted to pursue their own ends; they are obliged to pursue the ends God has set for them. Hence, the existence of God violates the dignity of humans. The next step in developing this line of argument is to provide more details about the conception of dignity this argument requires in order to be successful (Lougheed 2017, 351).

b. The Goods of Atheism Argument

The Goods of Atheism Argument emerged after the Meaningful Life Argument, and it is also best understood as a cluster of arguments. It has been observed that goods associated with atheism need not necessarily be connected to meaning in order to justify narrow personal anti-theism. With respect to goods such as privacy, autonomy, and understanding, it has seemed to some that a world without God could be better for certain individuals, at least when only considering those specific goods. For if goods such as privacy and autonomy are intrinsically valuable, then they don’t need to be connected to meaning in order to support personal forms of anti-theism (Lougheed 2018c). Of course, given the many advantages associated with theism (for example, no gratuitous evil), it is difficult to see how this line of argument could ever justify broad versions of anti-theism. It also remains an open question whether an individual could value these goods enough to justify personal anti-theism in the absence of their being connected to her life pursuits and hence meaning.

6. Connections to the Existence of God

This section explores connections that have been drawn between the axiological question about God and the existential question of whether God exists.

a. Divine Hiddenness

Most of the work connecting the axiological and existential questions about God concerns the argument from divine hiddenness for atheism. This argument runs roughly as follows. If God exists, then a relationship with God is one of the greatest goods possible. Because of this fact, if God exists there would be no instances of non-culpable, non-resistant, non-belief among those capable of a relationship with God. For belief that God exists is a necessary requirement for a relationship with God. Yet there appear to be instances of non-culpable, non-resistant, non-belief. Or at the very least, it is more likely that such individuals exist than that God exists. Thus, it’s probable that God doesn’t exist (Schellenberg 2006; 2015).

One line of argument in the literature attempts to demonstrate that reflections on the axiological consequences of theism and atheism can be used to object to arguments from divine hiddenness. Assume that an actual good obtaining is axiologically equivalent to the experience of the same good (even when that good doesn’t actually obtain). This is intuitive when one considers that from a first-person perspective there is no difference between a good actually obtaining and the mere experience of that same good (Lougheed 2018). They’re both experienced in exactly the same way from the first-person perspective. Now consider some goods often used to defend personal forms of anti-theism: privacy, independence, and autonomy. The key move in the argument is to suggest that these atheistic goods can be experienced in a theistic world where God is hidden. For example, consider the atheistic good of total and complete privacy. One can experience this good in a world where God hides. Indeed, many devoutly religious individuals sometimes report feeling alone and unable to feel God’s presence. Likewise, in a world where God hides one also gets many theistic goods. Maybe God intervenes and does a miracle to help someone, but the cause of the help is sufficiently unclear. So, it’s possible to doubt that God performed a miracle, and hence possible to doubt that God exists. Therefore, in a world where God hides, one is able to experience atheistic goods and also the theistic goods since they actually obtain. But atheistic goods cannot be experienced in a world where God isn’t hidden. If God’s existence were obvious (along with some of the divine attributes), for example, then one could not ever have the experience of total and complete privacy (even if it turns out to be, in some sense, an illusion). Finally, in an atheistic world no theistic goods obtain. Thus, a world where God is hidden is axiologically superior to an atheistic world, but more importantly, it’s also superior to a world where God isn’t hidden. These considerations serve to support the idea that God might hide in order to maximize the axiological value of the world (Lougheed 2018a).

One line of thought attempts to complete the axiological solution to divine hiddenness by showing that theistic goods do indeed obtain in a world where God hides. On the one hand, it’s clear that theistic goods obtain in a world where God hides simply because this is a logical consequence of God’s existence. On the other hand, it’s not clear that the experience of theistic goods such as forming a relationship with God, cosmic justice, or the afterlife is the same in both worlds. Indeed, the experience of such goods might be so different that the axiological assessment of them ought to differ too. At best, then, we aren’t in a good position to tell whether a world where God hides is axiologically superior to a world where God isn’t hidden. This suggests that the axiological solution to divine hiddenness is at best incomplete (Lougheed 2018b).

One objection to the axiological solution to divine hiddenness attempts to show that it’s intelligible to say that many of the goods typically associated with theism can be experienced in a world where God does not exist (even if they don’t actually obtain). For instance, an afterlife and divine intervention are goods that could both be experienced in a world where God doesn’t exist (Hendricks and Lougheed 2019). Also consider that a world in which God doesn’t exist is consistent with there being an extremely powerful being who is only slightly less powerful than God. This less powerful being could intervene to help humans, bring about an afterlife, and so forth. Such a being might not be possible on naturalism, but it is perfectly consistent with atheism. One of the benefits of the discussion of divine hiddenness and the axiology of theism is that it has brought into focus the goods associated with both theism and atheism, along with how we should understand the value of the experience of such goods. It seems that this is just the beginning of such discussions and much more work remains to be done on this topic.

b. Problem of Evil

One version of the problem of evil, known as the evidential (or probabilistic) problem of evil, suggests that if it’s probable that gratuitous evil exists, then it’s probable that God doesn’t exist. This is because the existence of God is taken to be logically incompatible with the existence of gratuitous evil. Some have suggested that if an individual endorses this or related arguments from evil, then she must also endorse pro-theism. This is because if she accepts the problem of evil then she believes that certain bad-making properties of a world (for example, gratuitous evil) are incompatible with God’s existence. But if God existed, then those bad-making properties would not exist, and hence the world would be better. So, the atheist who endorses the problem of evil as a reason for atheism must, in order to be consistent, also be a pro-theist (Penner and Arbour 2018).

c. Anti-Theism Entails Atheism

Finally, some have argued that if anti-theism is true, then atheism is true. Since God is perfectly good, God must always bring about the better over the worse. However, if anti-theism is true, then there are ways in which God doesn’t always bring about the better. But if God doesn’t always bring about the better over the worse then God doesn’t exist. So, the truth of anti-theism implies the truth of atheism. More strongly, it has been suggested that any negative feature associated with theism (for example, a lack of certain types of privacy) is evidence for atheism. This is because it is logically impossible that there be any negative features associated with a God who is omnibenevolent (Schellenberg 2018).

7. Future Directions

As noted, pro-theism and anti-theism are by far the two broad answers to the axiological question that have received the most attention in the literature to date. Given that much of contemporary philosophy of religion is focused on Christian theism, it isn’t surprising that many of the advantages and drawbacks associated with theism are also most clearly associated with typically Christian conceptions of God. In light of this, it seems that minority views deserve more attention in their own right. Additionally, comparative axiological analyses of other religious and non-religious worldviews would further expand the debate.

a. Exploration of Different Answers

As noted earlier, there are at least three additional answers to the axiological question worthy of further consideration. The first is quietism.  One reason to hold quietism was alluded to earlier, in Section 2. The necessitarian theist thinks there are no worlds where God doesn’t exist, and the necessitarian atheist thinks that there are no worlds where God exists. Given these views and given that the axiological question is a question about comparative judgments, one might think that it’s impossible to make the relevant comparison. As mentioned above, one way around this counterpossible worry might be to think of the comparison as one between epistemically possible worlds as opposed to metaphysically possible worlds. Another reason for quietism might be that worlds are somehow fundamentally incommensurable with one another and hence can’t be compared (Kraay 2018, 13). Consider that what makes an apple taste good is wholly different from what makes cheese taste good. It doesn’t make sense to compare them axiologically even though they’re both foods. This is a simple example intended to motivate incommensurability (Kraay 2011; Penner 2014).

The second additional answer to the axiological question is agnosticism. This view holds that while the axiological question is perhaps in principle answerable, we aren’t currently in a good position to discover the answer. Hence, we should suspend judgment about the answer to the axiological question. One way of motivating this view is scepticism about whether we have all of the relevant information required in order to make cross-world value judgments. Not only that, we might worry that even if we could identify particular good-making and bad-making features of a specific world, we don’t know how to combine those features so as to discover the overall value of that world. So, the agnostic holds that we aren’t in a good position to make value judgments about worlds, though such judgments are in principle possible (Kraay 2018, 10-11).

The third additional answer to the axiological question is neutralism. This involves the claim that God’s existence does not make an axiological difference to worlds. Perhaps God is valuable but shouldn’t be factored into assessments of world value. Or maybe one believes the axiological values of theism and atheism are precisely identical (Kraay 2018, 14). Quietism, agnosticism, and neutralism are surely not the only additional answers to the axiological question, but they represent a starting place for further research into different perspectives on the axiology of theism.

b. Other Worldviews

While the axiological question has only been asked about theism (and atheism), there is no in-principle reason why it couldn’t also be asked about other religious and non-religious worldviews. Indeed, the name ‘axiology of theism’ gives away the rather narrow focus of the literature so far. And it’s narrower still in focusing not just on ‘theism’ in general but on ‘monotheism’ in particular. There are numerous ways the current debate could be expanded. For instance, pantheism considers God and the universe to be one. The axiological question might not make sense with respect to pantheism (or might need to be reconstructed) since world value apart from God makes little sense if pantheism is true. Panentheism considers the universe to be a proper part of God and thus suffers from a similar worry. Or consider that on a polytheistic religion such as Hinduism the axiological question can be asked with respect to many different gods. Many of the deities of Hinduism have their own unique axiological value. Furthermore, one can explore whether it makes sense to assess the value of each deity separately or whether they need to be assessed together. Finally, consider that it’s far from clear that there is a concept of evil in Buddhism. At the very least, the Buddhist understanding of evil is quite different from how the Judeo-Christian tradition understands it. This brings into focus the question of whether it’s possible to make objective axiological judgments without somehow depending on the values of what one is supposed to be assessing in the first place. These concerns are raised only to show that the axiological question is quite far-ranging, and that much work remains to be done not only in assessing the value of theism and atheism, but also the values of other religious and non-religious worldviews.

8. References and Further Reading

  • Azadegan, E. (2019) “Antitheism and Gratuitous Evil.” The Heythrop Journal 60 (5): 671-677.
    • Argues that personal anti-theism is a form of gratuitous evil.
  • Cottingham, John. (2005) The Spiritual Dimension: Religion, Philosophy and Human Value. Cambridge: Cambridge University Press.
  • Chalmers, David. (2011) “The Nature of Epistemic Space.” In Epistemic Modality, Andy Egan and Brian Weatherson (eds.). Oxford: Oxford University Press, pp. 60-106.
    • Provides a model of epistemic possibility.
  • Davis, S.T. (2014) “On Preferring that God Not Exist (or that God Exist): A Dialogue.” Faith and Philosophy 31: 143-159.
    • A simply written dialogue discussing different ways of defending both anti-theism and pro-theism.
  • Dumsday, T. (2016) “Anti-Theism and the Problem of Divine Hiddenness.” Sophia 55: 179-195.
  • Hedberg, T., and Huzarevich, J. (2017) “Appraising Objections to Practical Apatheism.” Philosophia 45: 257-276.
  • Hendricks, P. and Lougheed, K. (2019) “Undermining the Axiological Solution to Divine Hiddenness.” International Journal for Philosophy of Religion 86: 3-15.
    • Argues that theistic goods could be experienced in a world where God doesn’t exist.
  • Kahane, G. (2011) “Should We Want God to Exist?” Philosophy and Phenomenological Research 82: 674-696.
    • This paper is responsible for starting the axiology of theism literature and is the first statement of the Meaningful Life Argument for anti-theism.
  • Kahane, G. (2012) “The Value Question in Metaphysics.” Philosophy and Phenomenological Research 85: 27-55.
  • Kahane, G. (2018) “If There Is a Hole, It Is Not God-Shaped.” In Kraay, K. [Ed.] Does God Matter? Essays on the Axiological Consequences of Theism. Routledge, 95-131.
    • Argues that God isn’t required to get many of the theistic goods mentioned by pro-theists.
  • Kraay, K.J. Ed. (2018) Does God Matter? Essays on the Axiological Consequences of Theism. Routledge.
    • This is the only edited collection on the axiological question and contains essays addressing a wide variety of issues from well-known philosophers of religion.
  • Kraay, K.J. (2018). “Invitation to the Axiology of Theism.” In Kraay, K.J.[Ed.] Does God Matter? Essays on the Axiological Consequences of Theism. Routledge, 1-36.
    • An extremely detailed survey chapter of the current debate including helpful prompts for further discussion.
  • Kraay, K.J. (2011) “Incommensurability, Incomparability, and God’s Choice of a World.” International Journal for Philosophy of Religion 69 (2): 91-102.
  • Kraay, K.J. and Dragos, C. (2013) “On Preferring God’s Non-Existence.” Canadian Journal of Philosophy 43: 153-178.
    • Responsible for identifying many of the more fine-grained answers to the axiological question.
  • Linford, D. and Megill, J. (2018) “Cognitive Bias, the Axiological Question, and the Epistemic Probability of Theistic Belief.” In Ontology of Theistic Beliefs: Meta-Ontological Perspectives. Ed. Mirosław Szatkowski. Berlin: de Gruyter.
  • Lougheed, K. (2017) “Anti-Theism and the Objective Meaningful Life Argument.” Dialogue 56: 337-355.
    • Defends the Meaningful Life Argument against Penner (2015).
  • Lougheed, K. (2018a) “The Axiological Solution to Divine Hiddenness.” Ratio 31: 331-341.
    • Argues that a world where God hides is more valuable than a world where God’s existence is obvious and a world where God doesn’t exist.
  • Lougheed, K. (2018b) “On the Axiology of a Hidden God.” European Journal for Philosophy of Religion 10: 79-95.
    • Argues that we cannot tell whether a world where God hides is more valuable than a world where God’s existence is obvious.
  • Lougheed, K. (2018c) “On How (Not) to Argue for the Non-Existence of God.” Dialogue: Canadian Philosophical Review 1-23.
    • Argues that pro-theism is not easier to defend than anti-theism.
  • Luck, M. and Ellerby, N. (2012) “Should we Want God Not to Exist?” Philo 15: 193-199.
  • Mawson, T. (2012) “On Determining How Important it is Whether or Not there is a God.” European Journal for Philosophy of Religion 4: 95-105.
  • McBrayer, J. (2010). “Skeptical Theism.” Philosophy Compass 5: 611-623.
  • McLean, G.R. (2015) “Antipathy to God.” Sophia 54: 13-24.
  • Metz, T. (2019). God, Soul and the Meaning of Life. Cambridge University Press.
    • An introduction to different theories of what constitutes a meaningful life.
  • Mugg, Joshua (2016) “The Quietist Challenge to the Axiology of God: A Cognitive Approach to Counterpossibles.” Faith and Philosophy 33: 441-460.
    • Applies a theory from the philosophy of mind to solve the worries about whether the axiological question is intelligible.
  • Penner, M.A. (2018) “On the Objective Meaningful Life Argument: A Reply to Kirk Lougheed.” Dialogue 57: 173-182.
    • Replies to Lougheed (2017).
  • Penner, M.A. (2015) “Personal Anti-Theism and the Meaningful Life Argument.” Faith and Philosophy 32: 325-337.
    • Develops Kahane (2011) into a more detailed version of the Meaningful Life Argument for anti-theism, but ultimately rejects it.
  • Penner, M.A. (2014) “Incommensurability, incomparability, and rational world-choice.” International Journal for Philosophy of Religion 75 (1): 13-25.
  • Penner, M.A. and Arbour, B.H. (2018) “Arguments from Evil and Evidence for Pro-Theism.” In Kraay, K.J. [Ed.] Does God Matter? Essays on the Axiological Consequences of Theism. Routledge, 192-202.
  • Penner, M.A. and Lougheed, K. (2015) “Pro-Theism and the Added Value of Morally Good Agents.” Philosophia Christi 17: 53-69.
    • Argues that God’s existence adds value to the world since God is a morally good agent.
  • Rescher, N. (1990) “On Faith and Belief.” In Human Interests. Stanford: Stanford University Press, 166-178.
    • The first time the axiology of God’s existence is explicitly mentioned in the contemporary literature.
  • Schellenberg, J.L. (2006). Divine Hiddenness and Human Reason. Cornell University Press.
    • This book represents the first statement of the argument from divine hiddenness as discussed in the contemporary literature.
  • Schellenberg, J.L. (2015) The Hiddenness Argument: Philosophy’s New Challenge to Belief in God. Oxford University Press.
    • A statement on divine hiddenness intended to be accessible to a wide audience.
  • Schellenberg, J.L. (2018) “Triple Transcendence, the Value of God’s Existence, and a New Route to Atheism.” In Kraay, K.J.[Ed.] Does God Matter? Essays on the Axiological Consequences of Theism. Routledge, 181-191.
  • Van Der Veen, J. and Horsten, L. (2013) “Cantorian Infinity and Philosophical Concepts of God.” European Journal for Philosophy of Religion 5: 117-138.
  • Williams, B. (1981) “Persons, Character and Morality,” in Moral Luck. Cambridge: Cambridge University Press.

Author Information

Kirk Lougheed
Email: philosophy@kirklougheed.com
Concordia University of Edmonton
Canada

John Wisdom (1904-1993)

Between 1930 and 1956, John Wisdom set the tone in analytic philosophy in the United Kingdom. Nobody expressed this better than J. O. Urmson in his Philosophical Analysis: Its Development Between the Two World Wars (1956) where, after Bertrand Russell and Ludwig Wittgenstein, Wisdom is the most frequently quoted philosopher. Wisdom was the leading figure of the Cambridge School of Therapeutic Analysis (which included other thinkers such as B. A. Farrell, G. A. Paul, M. Lazerowitz, and Norman Malcolm); the other major British school of analytic philosophy was that of ordinary language philosophy centered primarily at Oxford University.

Wisdom adopted the positions of both G. E. Moore and Wittgenstein, but he rejected the radical critique of metaphysics levelled by the Wittgenstein-inspired Vienna Circle. In contrast to Wittgenstein, Wisdom was not a philosopher of language: he maintained that most significant philosophical problems originate not with language but, in the first instance, as a result of our encounter with problems of the real world. From this standpoint, Wisdom introduced into analytic philosophy the discourse on the meaning of life and on problems of philosophy of religion. Be this as it may, prior to the appearance of Wittgenstein’s Philosophical Investigations (1953), Wisdom’s published works were read as indicators of the directions that Wittgenstein’s thought was taking following the latter’s return to philosophy in 1929.

By the 1960s, Wisdom’s influence had radically diminished. This was due largely to the ascendancy of exact philosophy of language and analytic metaphysics. This development, together with increasing emphasis on the power of scientific knowledge and its techniques, largely overshadowed the exploration of philosophical puzzles, human understanding (“apprehension”), and techniques of deliberation, which were Wisdom’s three chief theoretical concerns.

Table of Contents

  1. Biography
  2. Interpretation, Analysis, and Incomplete Symbols
    1. Interpretation and Analysis
    2. The Task of Analytic Philosophers
  3. Logical Constructions
    1. The Tasks of Philosophical Analysis
    2. Sketching Versus Picturing
    3. Types of Analysis
    4. Ostentation, Instead of Reference
  4. The Metaphysical Turn
    1. Philosophical Perplexity
    2. Philosophical “Statements” as both Misleading and Illuminating
    3. Descriptive Metaphysics
  5. Other Minds
    1. Philosophical Quasi-Doubts and their Therapy
    2. Contemplating Possibilities
    3. The Logic of Philosophical “Statements”
    4. Therapeutic Analysis
    5. On Certainty
  6. What is Philosophy?
    1. Epistemic Anxiety
    2. No Proofs in Philosophy
    3. Philosophy Explores Puzzles
    4. Philosophy Treats Paradoxes
  7. Philosophy of Religion
    1. Epistemic Attitudes
    2. The Logic of God
    3. The Meaning of Life
  8. References and Further Reading
    1. Primary Sources
      1. Books
      2. Papers
    2. Secondary Sources

1. Biography

(Arthur) John Terence Dibben Wisdom was born to the family of a clergyman in Leyton, Essex, on December 9, 1904. He attended the Aldeburgh Lodge School and the Monkton Combe School in Somerset. In 1921, he became a member of Fitzwilliam House, Cambridge, where he read philosophy and attended lectures by G. E. Moore, C. D. Broad, and J. M. E. McTaggart. Wisdom received his Bachelor of Arts in 1924, after which he worked for five years at the National Institute of Industrial Psychology. In 1929, he married the South African singer Molly Iverson. The couple had a son, Thomas, born in 1932, before separating during the Second World War. Between 1929 and 1934, Wisdom was a Lecturer in the Department of Logic and Metaphysics at the University of St. Andrews and a colleague of G. F. Stout. After the publication of his Interpretation and Analysis (1931) and the series of five articles on “Logical Constructions” (1931-1933), Wisdom was named Lecturer in Philosophy at Cambridge and a Fellow of Trinity College. This afforded him the opportunity to acquire firsthand knowledge of Ludwig Wittgenstein’s philosophical work.

Between 1948 and 1950, Wisdom delivered two series of Gifford Lectures on “The Mystery of the Transcendental” and “The Discovery of the Transcendental” that were never published (Ayers 2004). In 1950, Wisdom married Pamela Elspeth Stain, a painter. From 1950 to 1951, he served as president of the Aristotelian Society. In 1952, he was named Professor in Philosophy at Cambridge. Following his retirement from Cambridge in 1968, Wisdom spent four years teaching at the University of Oregon. Wisdom returned to Cambridge in 1972, and six years later was elected Honorary Fellow of Fitzwilliam College. He died in Cambridge on September 12, 1993.

2. Interpretation, Analysis, and Incomplete Symbols

a. Interpretation and Analysis

In his first book in 1931, Wisdom maintains that interpretation and analysis are two kinds of definition. Interpretation is a one-act paraphrase of a word or a phrase, a presentation of its meaning that remains at the same “level,” as when one links a word to its synonyms. By contrast, analysis “unpacks” the meaning at a deeper level (1931, p. 17). St. Augustine effectively captured the difference between interpretation and analysis in his famed reply to the question “What is time?”: “I know well enough what it is, provided that nobody asks me; but if I am asked what it is and try to explain, I am baffled” (Confessions, Book 11). Wisdom reads Augustine as communicating that he knows the interpretation of “time” but not its analysis. Problems arise because the two forms of definition are often difficult to distinguish in practice since elements of analysis tend to find their way into interpretations, with the result that the two categories sometimes overlap (p. 17).

A central theme in Interpretation and Analysis is Jeremy Bentham’s notion of fictitious entities. According to Bentham:

A fictitious entity is an entity to which, though by the grammatical form of the discourse employed in speaking of it, existence be ascribed, yet in truth and reality existence is not meant to be ascribed. (Bentham 1837, viii. p. 197)

The difference between objects of reality and fictional entities is that the latter are not components of facts. They have, as Bentham put it, only “verbal reality.”

Preserving individual perceptions and corporeal substances in his ontology, Bentham declares all other items “fictitious entities.” Such are the 10 predicaments of Aristotle, but also the color red. Similarly, Wisdom holds that persons, animals, and unicorns are individuals, while events and qualities are not. But concepts like “nations” are both individuals and fictitious entities.

b. The Task of Analytic Philosophers

Following Moore, Wisdom maintains that the business of analytic philosophy is to obtain a clear and precise grasp of a phrase’s meaning. A significant part of Moore’s work consists in trying to find the answer to questions like “What do we mean when we say: ‘This is a blackboard’?” (p. 8). However, following another of his teachers, Broad, Wisdom takes analysis to be only one practice of philosophy. There is also a speculative philosophy, which is fully on a par with analytic philosophy. The task of analytic philosophers is to clarify the propositions of speculative philosophy (compare to Broad 1924). Wisdom dedicates to this task a special book, Problems of Mind and Matter (1934a), in which he investigates G. F. Stout’s Mind & Matter (1931), which explores three notions: the “mental,” the “material,” and “psychology.”

Wisdom argues against the claim that language is the subject matter of analytic philosophy. He admits that “one of the best clues to the analysis of facts is the [analysis of the] sentence which expresses it” (1931, p. 64), but he insists that he does not really want to say that every philosophical proposition is bad grammar. In other places, Wisdom is more explicit: “The work of an analytic philosopher is not work on language. Indeed, all his results could be stated in many other systems of symbols” (p. 15) (compare to § 4.2). This point suggests that the findings and formulations of the analytic philosopher might be useful to the special sciences. For example, an analysis of the concept of “rent” can be used in political economy. What analytic philosophers strive for above all is clarity and precision everywhere, not only in philosophy.

3. Logical Constructions

Wisdom discusses his doctrine of logical construction (a term introduced by Bertrand Russell) in a series of five articles that appeared in Mind from 1931 to 1933. For a number of years, the philosophical community considered these essays to be “the most wholehearted of all attempts to set out the logical assumptions implicit in philosophical analysis” (Passmore 1966, p. 365).

a. The Tasks of Philosophical Analysis

What accounts for the difference between the analytic philosopher and the translator? Wisdom holds that the difference is one of differing paraphrastic intentions. Just as the statement of the liar need not differ in wording from the statement of the ignorant, the philosopher and the translator often speak the same words, but they intend different things.

That the analytic philosopher’s task closely approximates that of the translator reveals that the philosopher’s aim is not to learn new facts but to acquire a deeper insight into the ultimate structure of the facts. Such analysis is worth doing, in Wisdom’s view, since we may perfectly well know the facts but may possess no knowledge about their essential structure whatsoever (1931-3, pp. 169-70) (see § 2.1).

The latter claim is directed, in particular, against the Vienna Circle (compare to Stebbing 1933) inasmuch as, while Wisdom rejects metaphysical entities (for example, sense-data), at the same time he embraces metaphysics as a discipline studying the ultimate meaning, the structure of things.

b. Sketching Versus Picturing

Wisdom rejects the idea of the early Moore that propositions exist. This move appears to follow his reluctance to connect analysis to the world as an ontological entity. Wisdom also rejects Wittgenstein’s statement that “propositions” “picture” facts. This is confirmed by the fact that while “a sentence requires a speaker, a picture… requires an artist” (p. 62). Further justifying this position, he argues that when we write one sentence twice, we write two sentences, while the fact that these sentences “sketch” remains one and the same.

Instead of picturing, Wisdom maintains that language “sketches” facts (p. 56). By the act of “sketching,” one makes each element of a sentence “name” an element of the fact, while the order of elements in the sentence “shows” the form of the elements of the fact: it shows the “shape” of the fact. Wisdom calls the replacement of the components of facts by elements of the sentence “docketing” (p. 51).

Wisdom assesses sentences on a scale of “good expression” of facts. The sentences that best express a fact feature elements of the same spatial order as the elements of the fact. Importantly, the sentences of ordinary language are not identical with the spatial form of the facts they express but rather with something from a different logical level that might be derived from spatial form (p. 62). To avoid confusion, Wisdom suggests that when, for example, we report a red patch on this white sheet of paper, it is more precise to say “this red” instead of “this is red.”

c. Types of Analysis

A fact can be about (can sketch) another fact only if it is of the same order. Wisdom regards a fact to be of the “first order”—that is, its elements qualify as “ultimate elements”—if it is not a fact about a fact: in other words, if it features no element like “community,” or character like “machine,” or any other Benthamite fictitious entities. Wisdom also distinguishes “first derivative” facts: “If one supposes it to be a fact that some object is red, then the first derivative will be the fact that the object is characterized by red” (Urmson 1956, p. 81). The first derivative facts are logical constructions.

Since the ways facts can be about other facts can be of different orders, there are correspondingly different types of analysis. Wisdom discriminates between material, philosophical, and logical analysis (1934b, p. 16). Logical analysis assesses “functors.” Philosophical analysis, by contrast, serves a constructive role, making primary sentences of secondary sentences. Its objective is to render secondary facts ostensive, thereby yielding insight into their structure. Philosophical cognition can be defined as insight into structure, regardless of how one achieves that insight. It employs the method of what Wisdom identifies as “ostentation.”

The scientist undertakes material analyses. Analyses of this sort are even more ostensive than those Wisdom classifies as philosophical. This is no surprise, since material analysis is a same-level analysis, whereas philosophical analysis translates into a new level. Despite this clear difference between the two types of analysis, scientists often perform philosophical analysis, while philosophers for their part commonly engage in material analysis, for example, when they attempt to define “good” in naturalistic ethics.

d. Ostentation, Instead of Reference

Philosophers have always made use of the method of “ostentation.” Wisdom sees, for example, Bentham employing it under the guise of “paraphrase,” Russell under the guise of “logical construction” and “incomplete symbol.” Unfortunately, the method has never been analyzed in detail.

Wisdom defines “ostentation” as “a species of substitution” (1933, p. 1) by means of which one more clearly states the facts to which sentences refer. Each meaningful sentence ostensively “locates” facts, albeit with different success. Sentences containing general names, for instance, do not locate facts as successfully as do sentences with individual names.

The importance of the introduction of the notion of ostentation is that with its help, Wisdom avoids resorting to the use of L. S. Stebbing’s “absolute specific sense-qualities” (1933-4, p. 26). While Stebbing believed that the aim of analysis is “to know what precisely there is in the world” (1932-3, p. 65), Wisdom saw the task of analytic philosophy as exploring the ultimate structure of the facts.

4. The Metaphysical Turn

a. Philosophical Perplexity

Between 1934 and 1937, Wisdom regularly attended Wittgenstein’s classes in Cambridge. The impact of this encounter is clearly evident in “Philosophical Perplexity,” where Wisdom proclaims:

I can hardly exaggerate the debt I owe to [Wittgenstein] and how much of the good in this work is his—not only in treatment of this philosophical difficulty and that but in the matter of how to do philosophy. (1936, p. 36 n.)

In the paper, Wisdom underlines his old position that philosophical statements provide no new information. Their point is different from that of the factual propositions. The task of philosophical propositions is:

… the illumination of the ultimate structure of facts, that is the relations between different categories of being or (we must be in the mode) the relations between different sub-languages within a language. (1936, p. 37)

What is new in “Philosophical Perplexity” is the suggested (Wittgensteinian) tolerance toward the opposing claims philosophers make. If, for example, one philosopher maintains that philosophical statements are verbal, and another that they are not verbal, we can affirm that they both are right.

Wisdom pays special attention to the sentences that the neo-positivists dismiss as meaningless. Typical examples of such sentences are: “God exists,” “Humans are immortal,” and “I know what is going on in my friend’s mind,” sentences that give rise to traditional philosophical problems. Wisdom insists that it is misleading to call them all “meaningless,” not least because each proposition of this sort exhibits a meaninglessness of a different kind (compare to § 5.3). Nonsensical in different respects are propositions such as that two plus three is six and that one can play chess without the queen.

Puzzles of this sort can be solved “by reflecting upon the peculiar manner in which those sentences work,” in other words, by reflecting on their style, not on their subject. Wisdom’s “mnemonic slogan” now is: “It’s not the stuff, it’s the style that stupefies” (p. 38). Foregrounding style as a substantive philosophical concern, Wisdom initiates a move to discriminate between the “content” of a proposition and what we actually want to say with it—its “point.”

b. Philosophical “Statements” as both Misleading and Illuminating

Wisdom maintains that we often cannot say of a philosophical theory why it is false, although we feel that it is theoretically poor. Actually, the philosopher cannot say why a philosophical statement is false, simply because philosophical “statements” are not, properly speaking, statements but rather recommendations for elucidating some matter.

What misleads in philosophical “statements” is, above all, that they have a non-verbal air (compare to § 2.2). Philosophers often maintain, for example, that they can never know what is going on in other minds, as if they are dreaming of a world in which this were possible. This complaint is misleading, argues Wisdom, since it implies likeness that does not exist and conceals likeness that does.

Wisdom further claims that “philosophical theories are illuminating in a corresponding way, namely when they suggest or draw attention to a terminology which reveals likeness and differences concealed by ordinary language” (p. 41). In other words, by struggling with a philosophical puzzle, we can achieve progress by alternately shifting from provocation to resolution (p. 42).

The conclusion Wisdom reaches is that accepting a theory or a point of view might lead one not only to adopt different theoretical positions but also to acquire a novel cognitive stance of a general kind. Importantly, such cognitive differences are possible inasmuch as every judgement is also a decision. Even “a man who says that 1 plus 1 makes 2 does not really make a statement,” declares Wisdom, “he registers a decision” (1938, p. 53) (compare to § 5.2).

c. Descriptive Metaphysics

Just as with the propositions of mathematics, and the statements of psychoanalysis, ethics, poetry, and literature, it is difficult to define metaphysical claims. Apparently, metaphysics is closer to logic, understood as a discipline of a priori definitions. This is the conclusion that Moore reached studying Plato and Aristotle and that Russell came to as well in his study of logic and mathematics. Wisdom finds that by contrast with the logician, “the metaphysician looks for the definition of the indefinable” (1938, p. 60). Thus, metaphysics is not a kind of analysis—analysis is a function of logic. Rather, the ends of metaphysics are achieved in a “game of analyses.” When we define metaphysical questions and sentences, we are articulating the goals of play in the game.

To put it otherwise, the metaphysician is not aiming at analysis as such: “What metaphysicians want, or really want, is not definition but description” (p. 65). If we, nevertheless, would like to speak of analysis instead of descriptions in metaphysics, we should stipulate that the metaphysician is striving to analyze the unanalyzable.

5. Other Minds

Over a period of three years, beginning in 1940, Wisdom published a series of eight papers in Mind under the title “Other Minds” (1952a). The publication was the most important philosophical event in Britain during the Second World War, which explains why the opening discussion at the Joint Session of the Aristotelian Society and Mind Association in 1946 was devoted to “Other Minds,” with Wisdom and J. L. Austin presenting their positions (compare to Austin 1946).

a. Philosophical Quasi-Doubts and their Therapy

In these papers, Wisdom holds that philosophy is based on ever-recurring doubts. However, when we try to discuss these doubts, they “turn to dust” (1952, p. 6). Why is this? To answer this question, we need to discriminate between natural doubts about some fact of which we have no knowledge, and philosophical doubts. Philosophical doubts are less doubts in the normative sense than concerns over “logical irregularities.”

Wisdom differentiates three kinds of philosophical doubts: (i) Some doubts stem from the infinite corrigibility of statements about people and things, for example, “Smith believes that flowers feel.” (ii) A second sort are “inner-outer doubts.” When assailed by such concerns, we know all the data of a case but nevertheless doubt what is going on “in Smith’s head.” This state of mind figures in circumstances where, for example, we see that a driver stops at a red light but do not in fact know whether he sees the red light. (iii) Wisdom’s third class of doubt involves questions such as whether a zebra without stripes is still a zebra and whether a man can fulfill a promise by mistake.

Quasi-doubts of these kinds are doubts about predication. They all hinge on the problem of determining whether S is P. Wisdom detects three sources of the problem: (i) Infinity of the criterion of whether S is P. This engenders doubts of the kind evinced by questions such as “Are the taps closed?” and “Is this love?” (ii) A second source is conflict of criteria as to whether S is P. We see this in questions like “Can you play chess without the queen?” and “Are tomatoes fruits or vegetables?” (iii) Wisdom’s third source is hesitation by leap of criteria that determine whether S is P—the “leap” being from the inner to the outer, from the present to the past, from the actual to the potential.

Wisdom takes his position from psychoanalytic therapy, whereby “the treatment is the diagnosis and the diagnosis is the description, the very full description, of the symptoms” (p. 2 n.). The philosophical difficulty is eliminated only when the philosopher himself comprehensively describes his question, not in abstract general terms but narratively, by telling stories about it. Wisdom’s conclusion is that ultimately “every philosophical question, when it isn’t half asked, answers itself; when it is fully asked, answers itself” (ibid.). This is the main principle of his therapeutic analysis (compare to § 5.4).

b. Contemplating Possibilities

Wisdom also maintains that instead of speaking of metaphysical doubt, it is more correct to speak of contemplating possibilities (p. 6, 33). When I am pondering a philosophical puzzle “rival images are before me… two alternatives, two possibilities” (p. 14) and, in a process of deliberating on them, I understand the puzzle. Such contemplation aims at judgement, at decision (compare to § 4.2). In fact, “all philosophical doubts are requests for decision” (p. 3 n.), not for information.

As contemplation of possibilities, philosophical knowledge is clearly a priori. According to Wisdom, philosophical knowledge is the most general knowledge, more general than mathematical knowledge. That is why the “ignorance” in philosophy is not bona fide ignorance; the “doubt” in it is not genuine doubt. The philosophical pseudo-ignorance is usually combined with the perfect knowledge of the object. Moreover, observes Wisdom, “to grasp how philosophy though not logic is a priori and though a priori is not logic takes one far towards dissolving its difficulties” (p. 20).

c. The Logic of Philosophical “Statements”

According to Wisdom, the philosophical question is neither a logical proposition nor an empirical warning. It is a question of the form “Aren’t we really all mad?” or an exclamation like “We are all sinners!” Such phrases are requests for notational reform. They are not an appeal for a search of new facts.

Like all conflicts in philosophy, the “conflict between Sceptics and Phenomenalists,” avers Wisdom, “is removed not by proving the one [side] being wrong and the other right, but by investigating certain of the cases of each one’s saying what he does” (p. 56). One can do this by means of “careful description” of the usage of the competing phrases (compare to § 4.3). Wisdom perceives this method as being similar to that of the writers, who blend technique “with the detailed description of the concrete occasion” (p. 57).

Meaningless statements of belief, however, are different in type. This is evident in the contrast between, for example, the statement that in the dead man there is still something alive, and the statement that the clock is moved by a leprechaun, both of which differ typologically from the statement that a particular man now exists in a body other than his own. In this connection, Wisdom notes that “there is more difference between the grammar of ‘curly wolf’ and ‘pretence wolf’ than there is between the grammar of ‘curly wolf’ and ‘invisible wolf’” (p. 25; compare to p. 68). Moreover, “even within the category of physical objects there are differences in logic” (p. 76 n.), as in how “has legs” relates to “is a chair” differently than to “is a cushion.”

The principle “every sort of statement has its own sort of logic” implies that we cannot decide which among competing metaphysical statements is ultimately the winner (p. 62); there are no final proofs here. The inferences drawn in philosophy are no more than probable; they are true only in a “colloquial sense.” As Wisdom explains, we can say “none of these answers will do. There is a step [a decision], and we take it, but goodness knows how [… and this] is not an alternative answer, it is a repetition of the complaint” (ibid.).

d. Therapeutic Analysis

To the uncertainty expressed by the question “How do I know other minds?,” we can reply “By analogy.” This answer, however, as Wisdom points out, is as misleading as it is true; it seems true only initially. In fact, it is just another deceptive “smoother” in that it tranquillizes critical thought, albeit only momentarily. If we say, for instance, that the hippopotamus is a water horse, we must immediately add how this identification misleads.

Wisdom concludes from the foregoing the following thesis of therapeutic analysis:

The whole difficulty [in philosophy] arises like difficulty in a neurotic; the forces are conflicting but nearly equal. The philosopher remains in a state of confused tension unless he makes the [therapeutic] effort necessary to bring them all out by speaking of them and to make them fight it out by speaking of them together. It isn’t that people can’t resolve philosophical difficulties but that they won’t. In philosophy it is not a matter of making sure that one has got hold of the right theory but of making sure that one has got hold of them all. Like psychoanalysis it is not a matter of selecting from all our inclinations some which are right, but of bringing them all to light by mentioning them and in this process creating some which are right for this individual in these circumstances. (p. 124 n.)

e. On Certainty

An argument against the skeptical criticism of the claim “There are invisible leprechauns in the clock” is that we can imagine invisible leprechauns known only by the deity. Apparently, questions like “Are there leprechauns?” are not necessarily meaningless.

Even if we were to see the noumena, this would merely be a visual perception again; thus, as philosophers, we would need to be skeptical about them, too. It turns out that we cannot even imagine true noumena. Wisdom concludes that the skeptic’s statements do not participate in the discourse. In fact:

The sceptic refuses to back anything, saying that everything may lose except Logic which doesn’t. In saying this he appears to back something but he doesn’t. For his own statement can’t lose and doesn’t run. (1952a, p. 102 n.)

Some may claim that we can directly know other minds by telepathy. However, this again is only indirect knowledge—it is not a solution to the problem. To talk, for example, of John seeing literally everything that Smith sees is to speak of one person existing in two bodies. If somehow we all were to have a telepathic connection with Smith’s mind, then his private life would be common and the mind-processes in his head would be physical events.

The notion that we can have knowledge of someone else’s mind is, as Wisdom sees it, absurd. We encounter a logical impossibility here. To say “we can’t know other minds” is in the first instance to acknowledge that this is physiologically impossible. Once we understand that telepathy, too, cannot be a source of knowing other minds, however, we see that such knowledge is a logical impossibility.

6. What is Philosophy?

a. Epistemic Anxiety

The question “what is philosophy?” plays a central role in Wisdom’s works. In a review written in 1943, he maintains that:

… oscillation in deciding between philosophical doctrines goes hopelessly on until one gives up suppressing conflicting voices and lets them all speak their fill. Only then we can modify and reconcile them. (1943, p. 108)

All this provokes in us a feeling of uneasiness, since:

… we are very apt to be dissatisfied with our weighing[;] the weights too often and too much change every reweighing… It is that oscillation which finds expression in [the avowal] “I don’t know what I really want.” (p. 109)

This feeling of epistemic anxiety is most familiar from our experience with moral dilemmas, as on those occasions when we exclaim, “I shouldn’t have done that!” and then, a bit later, we temporize with a remark like, “Well, it isn’t that bad!” Wisdom finds a similar situation when trying to resolve a philosophical issue.

The worst thing, in Wisdom’s conclusion, that we can teach a child is blindly to be driven by a love or hatred that is unchangeable in principle. The pedagogical effort should teach the child to react cautiously and reflectively in different situations. The pupil should be taught to cultivate a broad spectrum of reasoning that he can bring to bear in examining every new development in his environment (compare to Ryle 1979, p. 121). Wisdom explains that the person who best accomplishes this increases the child’s:

… discrimination not so much of the objects to which he reacts as of his reaction to the objects… Not merely putting something into the child but bringing out the uneasiness which lurks in him. (1952a, p. 110)

b. No Proofs in Philosophy

Wisdom maintains that there cannot be proofs in philosophy, neither in a logical sense nor in an analytic sense. Philosophical proofs are invalid in principle. Indeed, a proof is only possible in complex cases, for example, in algebraic problems, where we have long chains of reasoning. In philosophy, however, the cases we are inclined to consider “proved” are simple. Exactly this is the source of the difficulty: the simpler the case, the more ambiguous are the words of the conclusion. This leads one to contemplate different alternatives and, in the process, to hesitate as to the conclusion. Proofs, however, are free from hesitation per definitionem. There are philosophical questions, not philosophical proofs.

Wisdom maintains that every philosophical question is a request for description of a class of “logical animals”—of a very familiar class of animals. “And because the animals are so familiar there is no question of the answers being wrong descriptions—but only of whether they are happy descriptions or not” (1944b, p. 112).

Entangled philosophical questions introduce new logic. Wisdom understands this to mean that they introduce new ways of seeing things that reveal what is already known in principle but is not before our eyes. Philosophical questions can be likened to the question of a person who is well aware of what a semaphore is but still asks what it is. Obviously this is not a question about facts. Wisdom construes it as a request for a new description, one motivated by the hope that it will eliminate some perplexity. In other words, philosophers exercise deductive reasoning that starts from things that everybody knows (compare to Russell 1914, p. 189ff.).

c. Philosophy Explores Puzzles

In marked disagreement with Wittgenstein, the later Wisdom maintains that “a purely linguistic treatment of philosophical conflicts is often inadequate” (1946a, p. 181). Philosophical puzzles commonly do not, he finds, possess a linguistic etiology (compare to §§ 2.2, 4.2), and they are not different in type from some other unsettling puzzles that confront us in life. The reasonableness employed in philosophical dispute is, says Wisdom, typically of the sort that a woman employs when she decides “which of the two men is the right one for her to marry,” or that a man uses when he must “decide which of two professions is the right one for him to take up” (p. 178).

In fact, the philosopher discusses his problems just as the businessman, the judge, or the army general does. However, he never approaches his discussions as a preparation for action. The philosopher, declares Wisdom, simply “desires the discussion never to end and dreads its ending.” He is like:

… the man who cannot be sure that he has turned off the tap or the light. He must go again to make sure, and then perhaps he must go again because though he knows the light’s turned off he yet can’t feel sure. (p. 172)

However, in contrast to the neurotic, the philosopher can never resolve his doubts. This is because he does not actually doubt but just pretends to doubt, and he does not pretend merely to others but to himself as well.

Philosophy also resembles logic and mathematics but fields no theories or theorems. Instead, it formulates puzzles, such as those captured in questions like “Can a man do what the other does?” Puzzles of this kind introduce new forms of logic, which the philosopher sifts for hidden characteristic marks of conventional logic. Philosophical puzzles are no less unreal than caricatures; neither do they assert facts. They arise partly from language and partly from our pre-predicative practices.

d. Philosophy Treats Paradoxes

Wisdom’s skeptic claims that we cannot be absolutely sure that, for example, this map represents London. This is true for all statements “about what is so.” When we see a fox’s head, we still cannot be sure that this is a fox’s head. This worry Wisdom dismisses as a product of the logical model of the “man behind the scene [which is…] inappropriate to his logical situation” (1950a, p. 250). What is to be realized when looking at such statements is “how each answer [to a sceptical claim] illuminates what others obscure and obscures what the others illuminate” (p. 254).

It is through a process of asking similar questions and developing answers to them that philosophical problems are resolved. Questions such as “whether the infinite numbers are numbers,” “whether the wild horses are horses,” and “whether a chess game without the queen is a chess game” are all questions of this sort, according to Wisdom, and are requests for judgment (compare to § 7.2). As such discourses reach their terminus, perplexity is replaced by new apprehension, a new “take” on the matter at hand.

Questions of the type “What is this?” are neither inductive nor deductive. Their point differs with different questioners and with different circumstances. Resolving them requires prolonged investigation, which may end in expressions of exasperation, such as “I won’t bother any more with it! I have already thought it over!” Such questions are paradoxical.

Likewise paradoxical, avers Wisdom, are the doctrines of metaphysics, when they are not platitudes. They are “truths which couldn’t but be true” (p. 264), similar to the infinite tautology of absolute skepticism. Usually, they are expressed as paradoxical questions that concern the character of foundations or of knowledge. Metaphysicians approach their questions in terms of general themes, such as things and persons, space and time, good and evil, and so on.

7. Philosophy of Religion

a. Epistemic Attitudes

Wisdom devotes considerable attention to discussing problems of philosophy of religion. His main claim here is that the religious believer and the atheist think about different worlds. “The theist,” he says, “[often] accuses the atheist of blindness and the atheist accuses the theist of seeing what isn’t there” (1944c, p. 158). This difference in attitude determines the difference in seeing different worlds (p. 160).

People with different attitudes see the same facts differently. For example, a married couple may enter a room, and one may sense that someone has been there, while the other adamantly denies that there is any clue to substantiate the spouse’s hunch. Most such occurrences are rather a question of feeling than of experience. Wisdom considers it inappropriate in such cases to ask who is right.

Such exercises in reasoning are typically explored in philosophy as well as in religion. However, Wisdom holds that they also have a place in some a priori domains of theoretical thinking, such as philosophy of mathematics, where two competing parties (say, logicists and constructivists) defend theses, each of them being “right” in their way.

Wisdom’s conclusion, clearly opposing the logic of Gottlob Frege and Russell, is that in such disciplines “the process of argument is not a chain of demonstrative reasoning” (p. 157). Of course, the growth of knowledge in these disciplines is, similarly to that in science, cumulative. However, it starts from several independent premises—not by mechanically iterating the transformation of a set of premises, as in Principia Mathematica.

Wisdom adduces that we can find a solution to a cognitive problem not only by adding new illuminations but also “by talk.” Occasionally, in the process of trying to demonstrate that our opponent is wrong, we become aware that it is we who are mistaken. Often our opponent has unconscious reasons for his attitude, which we should try to make explicit. Such a methodology finds us “connecting and disconnecting” cases, thus “explaining a fallacy in reasoning” (p. 161).

b. The Logic of God

In a 1950 BBC presentation titled “The Logic of God,” Wisdom introduces the example of someone who tries on a new hat and gets the following reaction: “My dear, it’s the Taj Mahal” (1965a). Literally understood, the claim that the hat is a temple is clearly absurd. However, just as absurd is the statement that we can or cannot know other minds. Be this as it may, such claims are not pointless. They simply call, in Wisdom’s view, for a “dialectic process in which they are balanced” (p. 263). Thus, the paradox “We are all mad” should be balanced with its opposite: “We are all sane.” We then arrive at the (quasi-Hegelian) synthesis, “Some of us are mad, but others are not.” Wisdom recommends the same procedure when we address metaphysical problems. Otherwise, we are exposed, he believes, to the threat of the one-sided “road to Solipsism [where] there blows the same wind of loneliness which blows on the road to the house with walls of glass which no one can break” (p. 282).

Wisdom maintains that “sometimes it is worth saying what everybody knows” (1950b, p. 2), in particular, as doing so changes our apprehension of the facts. Such statements do not tell the truth. They reveal it. Indeed, “we sometimes use words neither to give information… nor to express and evoke feelings… but to give greater apprehension of what is before us” (p. 6).

Not all questions have an answer. Among the great unanswerable questions is whether God exists. Wisdom avers that we have only fragmentary evidence for such existence, not proofs. If we want a complete proof here, we should need per impossibile to adduce all of God’s characteristics. Similarly, the complete proof of the existence of the rainbow cannot be less complex than all its characteristics.

To substantiate this position, Wisdom refers to his theory of logical models, according to which different kinds of objects have their own logic. For example, the logic of God is much more alien to the logic of electricity than the logic of milk is to the logic of wine (p. 15). It is more eccentric. A typical characteristic of the logic of God, in contrast to the logic of electricity, is that we have no idea what to expect about its real essence.

There are similar “logics of ignorance.” Thus, the actor may not know exactly how he will act when he assumes the role of his character. He will see that he is getting it wrong only after a first misstep. Conversely, the actor understands that he is on the right track only when his work is complete. Something similar happens when we act in our own character. Euripides, St. Paul, and Sigmund Freud observed how sometimes the agent is not aware that it is not he who performs his deeds. He is governed by his Super-Ego, the logic of which is close (at least for St. Paul) to that of God.

That our knowledge is not only knowledge of facts is attested, Wisdom holds, by the circumstance that, as Freud put it, we do not know even ourselves. We see this in the difficulty we experience when we strive to transcend limited judgments in order to reach some final judgment, or a “divine” judgment, which Wisdom describes as “a judgment which takes everything into account and gives it its correct weight” (1965d, p. 32-3).

c. The Meaning of Life

Wisdom considers the Existentialist movement in philosophy, rather popular on the Continent in the 1950s and 1960s, an evasion, a diversion from the real difficulties of life. He praises it for concentrating on something that only a relatively few philosophers considered worthy of debate in the decades immediately following the Second World War. He charges, however, that the existentialists’ arguments were by and large merely ad rem. It is well known, declares Wisdom, that “one of the best ways of keeping concealed the most horrible is to emphasize the horror of the less horrible and to denigrate the good” (1965c, p. 37).

Against the existentialists, Wisdom insists that despite all the misery in the world, there are situations in which we find complete meaning. He further notes that we can ask “What holds all this up?” but not “What holds up all things?” To be more exact, one cannot answer the question “What is the meaning of all this?” in a single determinate thought or sentence. We find the meaning, on Wisdom’s conception, in many scattered moments of cheerfulness that do not attach to intellectual dishonor, stupidity, or evasion.

Apparently, “What is the meaning of all this?” is not a meaningless question, as the logical positivists maintained. There are many clearly meaningful cases in which one asks “what is the meaning of all this,” as when, for example, the critic tries to grasp the idea of a play. We cannot give only one answer to such questions, though, nor can we supply a fully complete list of the things we believe to be the answer. This, however, does not mean that the words cheat us, as it were, and that such questions cannot be addressed in principle, or that we cannot progress toward an answer. Indeed, opines Wisdom, “the historians, the scientists, the prophets, the dramatists and the poets assist us in our attempts to answer the question of life” (p. 42).

Wisdom concludes that religious issues are also issues of fact (compare to § 6.2). They require new apprehension of facts, in the same way as the court aims at illumination and new apprehension of the facts. To articulate religious propositions is not, according to Wisdom, simply to express an attitude toward life, as the emotivists believe. Nor are such propositions merely matters of intuition or of decision.

8. References and Further Reading

a. Primary Sources

i. Books

  • 1931. Interpretation and Analysis in Relation to Bentham’s Theory of Definition, London: Kegan Paul.
  • 1931-3. Logical Constructions, ed. by J. J. Thomson, New York: Random House, 1969.
  • 1934a. The Problems of Mind and Matter, 2nd ed., Cambridge: Cambridge University Press.
  • 1952a. Other Minds, 2nd ed., Oxford: Blackwell, 1965.
  • 1953a. Philosophy and Psycho-Analysis, Oxford: Blackwell.
  • 1965a. Paradox and Discovery, Oxford: Blackwell.
  • 1991. Proof and Explanation: The Virginia Lectures, ed. by S. F. Barker, Lanham (Maryland): University Press of America.

ii. Papers

  • 1933. “Ostentation,” in (1953a): 1-15.
  • 1934b. “Is Analysis a Useful Method in Philosophy?” in (1953a): 16-35.
  • 1936. “Philosophical Perplexity,” in (1953a): 36-50.
  • 1938. “Metaphysics and Verification,” in (1953a): 51-101.
  • 1943. “Critical Notice: C. H. Waddington, and others, Science and Ethics,” in (1953a): 102-111.
  • 1944a. “Moore’s Technique,” in (1953a): 120-148.
  • 1944b. “Philosophy, Anxiety and Novelty,” in (1953a): 112-119.
  • 1944c. “Gods,” in (1953a): 149-168.
  • 1946a. “Philosophy and Psycho-Analysis,” in (1953a): 169-181.
  • 1946b. “Other Minds,” in (1952a): 206-229.
  • 1947. “Bertrand Russell and Modern Philosophy,” in (1953a): 195-209.
  • 1948a. “Note on the New Edition of Professor Ayer’s Language, Truth and Logic,” in (1953a): 229-247.
  • 1948b. “Things and Persons,” in (1953a): 217-228.
  • 1950a. “Metaphysics,” in (1952a): 245-65.
  • 1950b. “The Logic of God,” in (1965a): 1-22.
  • 1952b. “Ludwig Wittgenstein, 1934-37,” in (1965a): 87-9.
  • 1953b. “Philosophy, Metaphysics and Psycho-Analysis,” in (1953a): 248-82.
  • 1957. “Paradox and Discovery,” in (1965a): 114-38.
  • 1959. “G. E. Moore,” in (1965a): 82-87.
  • 1961a. “A Feature of Wittgenstein’s Technique,” in (1965a): 90-103.
  • 1961b. “The Metamorphoses of Metaphysics,” in (1965a): 57-81.
  • 1965b. “Religious Belief,” in (1965a): 43-56.
  • 1965c. “Existentialism,” in (1965a): 34-37.
  • 1965d. “Freewill,” in (1965a): 23-33.
  • 1971. “Epistemological Enlightenment,” Proceedings of the American Philosophical Association, 44: 32-44.

b. Secondary Sources

  • Austin, J. L. 1946. “Other Minds,” Philosophical Papers, 2nd ed., Oxford: Oxford University Press, 1970, p. 76-116.
  • Ayers, Michael. 2004. “John Wisdom,” Oxford Dictionary of National Biography, vol. 59, Oxford: Oxford University Press, p. 827-8.
  • Broad, C. D. 1924. “Critical and Speculative Philosophy,” in: J. H. Muirhead (ed.), Contemporary British Philosophy, 1st ser., London: Allen & Unwin, p. 75-100.
  • Flew, Antony. 1978. A Rational Animal, Oxford: Clarendon Press.
  • Milkov, Nikolay, The Varieties of Understanding: English Philosophy Since 1898, New York: Peter Lang, p. 435-521.
  • Moore, G. E. 1917. “The Conception of Reality,” in (1922), p. 197-219.
  • Moore, G. E. 1922. Philosophical Studies, London: Allen & Unwin.
  • Moore, G. E. 1966. Lectures on Philosophy, ed. by C. Lewy, London: Allen & Unwin.
  • Passmore, John. 1966. A Hundred Years of English Philosophy, 2nd ed., Harmondsworth: Penguin (1st ed. 1957).
  • Price, H. H. 1953. Thinking and Experience, London: Hutchinson University Library.
  • Russell, Bertrand. 1914. Our Knowledge of the External World, London: Routledge, 1993.
  • Russell, Bertrand. 1918. “The Philosophy of Logical Atomism”; in: idem, Logic and Knowledge, ed. by R. C. Marsh, London: Allen & Unwin, p. 175-281.
  • Ryle, Gilbert. 1949. The Concept of Mind, Harmondsworth: Penguin (2nd ed.), 1973.
  • Ryle, Gilbert. 1979. On Thinking, Oxford: Blackwell.
  • Stebbing, Susan. 1932-3. “The Method of Analysis in Metaphysics,” Proceedings of the Aristotelian Society 33: 65-94.
  • Stebbing, Susan. 1933. “Logical Positivism and Analysis,” Proceedings of the British Academy 19: 53-87.
  • Stebbing, Susan. 1933-4. “Constructions,” Proceedings of the Aristotelian Society 34: 1-30.
  • Stout, G. F. 1931. Mind & Matter, Cambridge: Cambridge University Press.
  • Urmson, J. O. 1956. Philosophical Analysis: Its Development between the two World Wars, Oxford: Clarendon Press.
  • Wittgenstein, Ludwig. 1953. Philosophical Investigations, Oxford: Blackwell.
  • Wittgenstein, Ludwig. 2005. The Big Typescript, Oxford: Blackwell.

Author Information

Nikolay Milkov
Email: nikolay.milkov@upb.de
University of Paderborn
Germany

Plato: The Academy

Plato’s enormous impact on later philosophy, education, and culture can be traced to three interrelated aspects of his philosophical life: his written philosophical dialogues, the teaching and writings of his student Aristotle, and the educational organization he began, “the Academy.” Plato’s Academy took its name from the place where its members congregated, the Akadēmeia, an area outside of the Athens city walls that originally held a sacred grove and later contained a religious precinct and a public gymnasium.

In the fifth century B.C.E., the grounds of the Academy, like those of the Lyceum and the Cynosarges, the two other large gymnasia outside the Athens city walls, became a place for intellectual discussion as well as for exercise and religious activities. This addition to the gymnasia’s purpose was due to the changing currents in Athenian education, politics, and culture, as philosophers and sophists came from other cities to partake in the ferment and energy of Athens. Gymnasia became public places where philosophers could congregate for discussion and where sophists could offer samples of their wisdom to entice students to sign up for private instruction.

This fifth-century use of gymnasia by sophists and philosophers was a precursor to the “school movement” of the fourth century B.C.E., represented by Antisthenes teaching in the Cynosarges, Isocrates near the Lyceum, Plato in the Academy, Aristotle in the Lyceum, Zeno in the Stoa Poikile, and Epicurus in his private garden. Although these organizations contributed to the development of medieval, Renaissance, and contemporary schools, colleges, and universities, it is important to remember their closer kinship to the educational activities of the sophists, Socrates, and others.

Plato began leading and participating in discussions at the Academy’s grounds in the early decades of the fourth century B.C.E. Intellectuals with a variety of interests came to meet with Plato, who gave at least one public lecture, as well as to conduct their own research and participate in discussions on the public grounds of the Academy and in the garden of the property Plato owned nearby. By the mid-370s B.C.E., the Academy was able to attract Xenocrates from Chalcedon (Dillon 2003: 89), and in 367 Aristotle arrived at the Platonic Academy from relatively far-off Stagira.

While the Academy in Plato’s time was unified around Plato’s personality and a specific geographical location, it was different from other schools in that Plato encouraged doctrinal diversity and multiple perspectives within it. A scholarch, or ruler of the school, headed the Academy for several generations after Plato’s death in 347 B.C.E. and often powerfully influenced its character and direction. Though the Roman general Sulla’s destruction of the Academy’s grove and gymnasium in 86 B.C.E. marks the end of the particular institution begun by Plato, philosophers who identified as Platonists and Academics persisted in Athens until at least the sixth century C.E. This event also represents a transition point for the Academy from an educational institution tied to a particular place to an Academic school of thought stretching from Plato to fifth-century C.E. neo-Platonists.

Table of Contents

  1. The Academy Prior to Plato’s Academy: Sacred Grove, Religious Sanctuary, Gymnasium, Public Park
  2. Athenian Education Prior to Plato’s Academy: Old Education, Sophists, Socrates and his Circle
  3. The Academy in Plato’s Time
    1. Location and Funding
    2. Areas of Study, Students, Methods of Instruction
  4. The Academy after Plato
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. The Academy Prior to Plato’s Academy: Sacred Grove, Religious Sanctuary, Gymnasium, Public Park

In early times, the area northwest of Athens near the river Cephissus was known as the Akadēmeia or Hekadēmeia and contained a sacred grove, possibly named after a hero called Akademos or Hekademos (Diogenes Laertius, Lives and Opinions of Eminent Philosophers III.7-8, cited hereafter as “Lives”). Plutarch mentions a mythical Akademos as a possible namesake for the Academy, but Plutarch also records that the Academy may have been named after a certain Echedemos (Theseus 32.3-4). While the Academy may have been named after an ancient hero, it is also possible that an ancient hero may have been created to account for the Academy’s name.

The Academy was bordered on the east by Hippios Kolonos and to the south by the Kerameikos district, which was famous for its pottery production.  In the late sixth century B.C.E., the Peisistratid tyrant Hipparchus reportedly constructed a public gymnasium in the area known as the Academy (Suda, Hipparchou teichion). This building project, known for its expense, walled in part of the area known as the Academy. Hipparchus probably developed the gymnasium at the Academy to win favor with residents of the Kerameikos district. Like the other major gymnasia outside the city walls, the Lyceum and the Cynosarges, the Academy’s function as a gymnasium operated in tandem with its function as a religious sanctuary.

After Xerxes led the Persians to burn Athens in 480 B.C.E., Themistocles rebuilt the city wall in 478 B.C.E. (Thucydides 1.90), dividing the Kerameikos into an inner Kerameikos and outer Kerameikos. Some time afterwards, Cimon reportedly rebuilt the Academy as a public park and gymnasium by providing it with a water supply, running tracks, and shaded walks (Plutarch, Cimon 13.8). On the way to the Academy from Athens, one passed from the inner Kerameikos to the outer Kerameikos through the Dipylon gate in the city’s wall; continuing on the road to the Academy, one passed through a large cemetery. Referring to the area of the outer Kerameikos on the way to the Academy, Thucydides writes, “The dead are laid in the public sepulcher in the most beautiful suburb of the city, in which those who fall in war are always buried, with the exception of those slain at Marathon” (Thucydides 2.34.5, trans. Crawley). Pausanias, writing in the second century C.E., likewise describes the Academy as a district outside of Athens that has graves, sanctuaries, altars, and a gymnasium (Attica XXIX-XXX). In addition to the shrines, altars, and gymnasium mentioned by Thucydides and Pausanias, there were also gardens and suburban residences in the nearby area (Baltes 1993: 6).

Due to the improvements initiated by Hipparchus and Cimon, the Academy became a beautiful place to walk, exercise, and conduct religious observances. Aristophanes’ The Clouds, first produced in 423 B.C.E., contrasts the rustic beauty of the Academy and traditional education of the past with the chattering and sophistic values of the Agora. Describing the difference, Aristophanes’ “Better Argument” says,

But you’ll be spending your time in gymnasia, with a gleaming, blooming body, not in outlandish chatter on thorny subjects in the Agora like the present generation, nor in being dragged into court over some sticky, contentious, damnable little dispute; no, you will go down to the Academy, under the sacred olive-trees, wearing a chaplet of green reed, you will start a race together with a good decent companion of your own age, fragrant with green-brier and catkin-shedding poplar and freedom from cares, delighting in the season of spring, when  the plane tree whispers to the elm. (1002-1008, trans. Sommerstein)

While The Clouds illustrates that the grounds of the Academy in the 420s had running tracks, a water source, sacred olive groves, and shady walks with poplar, plane, and elm trees, it is not clear whether the Academy was as free of sophistry as Aristophanes presents it, perhaps ironically, in his comedy. At any rate, the Academy was very soon to become a place for intellectual discussion, and its peaceful environment was also headed for disruption by the Spartan army’s occupation of its grounds during the siege of Athens in 405-4 B.C.E.

2. Athenian Education Prior to Plato’s Academy: Old Education, Sophists, Socrates and his Circle

The Greek word for education, paideia, covers both formal education and informal enculturation. Paideia was traditionally divided into two parts: cultural education (mousikē), which included the areas of the Muses, such as poetry, singing, and the playing of instruments, and physical education (gymnastikē), which included wrestling, athletics, and exercises that could be useful as training for battle. Instruction in cultural and physical education was not paid for by public expenditure in the archaic or classical period in Athens, so it was only available to those who could afford it. Education often took place in public places like gymnasia and palestras. During the classical period, writing and basic arithmetic became a basic part of elementary education as well.  In addition to formal education, attendance at religious festivals, dramatic and poetic competitions, and political debates and discussions formed an important part of Athenians’ education. Broadly, an Athenian man educated in the “Old Education” championed by Aristophanes’ “Better Argument” would be familiar with the poetry of Homer and Hesiod, be able to read, write, and count well enough to manage his personal life and participate in the life of the polis, and be cultured enough to appreciate the city’s comic and tragic festivals.

In the fifth century B.C.E., philosophers and sophists came to Athens from elsewhere, drawn by the city’s growing wealth and climate of intellectual activity. Anaxagoras likely came to Athens sometime between 480 and 460 B.C.E. and associated with Pericles, the important statesman and general (Plato, Phaedrus 270a). Parmenides and Zeno came to Athens in the 450s, and the sophist Protagoras of Abdera came to Athens in the 430s and also associated with Pericles. Gorgias, the rhetorician from Leontini, came to Athens in 427 B.C.E., and he taught rhetoric for a fee to Isocrates, Antisthenes, and many others.

Itinerant teachers like Protagoras and Gorgias both supplemented and destabilized the traditional education provided in Athens, as Aristophanes’ comedy The Clouds, the dialogues of Plato, and other sources document. In order to gain paying students, sophists, rhetoricians, and philosophers would often make presentations in public places like the Agora or in Athens’s three major gymnasia, the Academy, the Cynosarges, and the Lyceum. While the accounts of Xenophon and Plato contradict Aristophanes’ comic portrayal of Socrates as a teacher of rhetoric and natural science, the Platonic dialogues do show Socrates frequenting gymnasia and palestras in search of conversation. In the dialogue Euthyphro, Euthyphro associates Socrates with the Lyceum (2a); in the dialogue Lysis, Socrates narrates how he was walking from the Academy to the Lyceum when he was drawn into a conversation at a new wrestling school (203a-204a). Similarly, the Euthydemus presents a conversation between Socrates and two sophists in search of students in a gymnasium building on the grounds of the Lyceum (271a-272e). While Socrates, unlike the sophists, did not take payment or teach a particular doctrine, he did have a circle of individuals who regularly associated with him for intellectual discussion. While the establishment of philosophical schools by Athenian citizens in the major gymnasia of Athens seems to be a fourth-century phenomenon, the Platonic dialogues indicate that gymnasia were places of intellectual activity and discussion in the last decade of the fifth century B.C.E., if not before.

3. The Academy in Plato’s Time

 As noted in the previous section, the Academy, the Lyceum, and the Cynosarges functioned as places for intellectual discussion as well as exercise and religious activity in the fifth century B.C.E. It is likely that the aristocratic Plato spent some of his youth at these gymnasia, both for exercise and to engage in conversation with Socrates and other philosophers. After Socrates’ death in 399 B.C.E., Plato is thought to have spent time with Cratylus the Heraclitean, Hermogenes the Parmenidean, and then to have gone to nearby Megara with Euclides and other Socratics (Lives III.6). Isocrates, student of Gorgias, began teaching in a private building near the Lyceum around 390 B.C.E., and Antisthenes, who also studied with Gorgias and was a member of Socrates’ circle, held discussions in the Cynosarges around that time as well (Lives VI.13). While the Platonic Academy is often seen as the prototype of a new kind of educational organization, it is important to note that it was just one of many such organizations established in fourth-century Athens.

It is likely that Isocrates and Antisthenes established schools of some sort before Plato. Contemporary scholars often assign a founding date for the Academy between the dates of 387 B.C.E. and 383 B.C.E., depending on these scholars’ assessment of when Plato returned from his first trip to Syracuse. Rather than assign a particular date at which the Academy was founded, as though ancient schools possessed formal articles or charters of incorporation (see Lynch 1972), it is more plausible to note that Plato began associating with a group of fellow philosophers in the Academy in the late 390s and that this group gradually gathered energy and reputation throughout the 380s and 370s up until Plato’s death in 347 B.C.E.

a. Location and Funding

Plato was himself from the deme of Collytus, a wealthy district southwest of the Acropolis and within the city walls built by Themistocles. Collytus was a few miles from the Academy, so Plato’s relocating near the Academy would have been an important step in establishing himself there. While some have emphasized the Academy’s remoteness from the Agora (Rihill 2003: 174), the six stades (three quarters of a mile) from the Dipylon gate and three more stades from the Agora would not have constituted much of a barrier to anyone interested in seeing the goings-on of the Academy in Plato’s time.

In keeping with the Academy’s customary use as a place of intellectual exchange, Plato used its gymnasium, walks, and buildings as a place for education and inquiry; discussions held in these areas were semi-public and thus open to public engagement and heckling (Epicrates cited in Athenaeus, Sophists at Dinner II.59; Aelian, Historical Miscellany 3.19; Lives VI.40). While some scholars have thought that Plato somehow resided in the sacred precinct and gymnasium of the Academy or purchased property there, this is not possible, for religious sanctuaries and areas set aside for gymnasia were not places where citizens (or anyone else) could set up residency. Rather, as Lynch, Baltes, and Dillon have argued, Plato was able to purchase a property with its own garden near the sanctuaries and gymnasium of the Academy. While much of the Platonic Academy’s business was conducted on the public grounds of the Academy, it is natural that discussions and possibly shared meals would also occur at Plato’s nearby private residence and garden. Given the proximity of Plato’s private residence to the sanctuary and gymnasium of the Academy and the fact that his nearby property and school were both referred to as “the Academy” (Plutarch, On Exile 603b), there has been confusion about the particulars of the physical plant of the Platonic Academy.

Plato was of aristocratic stock and of at least moderate wealth, so he had the financial means to support his life of philosophical study. Following Socrates’ example and departing from the sophists and Isocrates, Plato did not charge tuition for individuals who associated with him at the Academy (Lives IV.2). Still, students at the Academy had to possess or come up with their own sustenance (Athenaeus, Sophists at Dinner IV.168). In addition to receiving funds from either Dion of Syracuse or Anniceris of Cyrene to purchase property near the Academy (Lives III.20), Diogenes Laertius records that Dion paid for Plato’s costs as choregus or chorus leader (a claim also made in Plutarch’s Dion XVII.2) and purchased Pythagorean philosophical texts for him, and that Dionysius of Syracuse gave him eighty talents (Lives III.3,9). Part of the purpose of Plato’s trips to Syracuse may have been to participate in political reform, but it is also possible that Plato was seeking patrons for the philosophical activity engaged in at the Academy.

While it is probable that Plato associated with other philosophers, including the Athenian mathematician Theaetetus, in the Academy as early as the late 390s (see Nails 2009: 5-6; Nails 2002: 277; Thesleff 2009: 509-518 with Proclus’s Commentary on the First Book of Euclid’s Elements, Book 2, Chapter IV for more details on Theaetetus’s involvement with the Academy), it is the purchase of the property near the Academy after his trip to see Dion in Syracuse that scholars often refer to when speaking of the founding of the Academy in either 387 B.C.E. or 383 B.C.E. While purchase of this property was important to the development of the Platonic Academy, it is important to remember, as Lynch has shown, that Plato’s Academy was not legally incorporated or a juridical entity.  While the wills of Theophrastus (Lives V.52-53) and Epicurus (Lives X.16-17) make provisions for the continuation of their schools and the future control of school property, the will of Plato does not mention the Academy as such (Lives III.41-43). This indicates that while the Platonic Academy was thriving during Plato’s lifetime, it was not essentially linked to any private property possessed by Plato (compare Dillon 2003: 9; see further Nails 2002: 249-250).

b. Areas of Study, Students, Methods of Instruction

 The structure of the Platonic Academy during Plato’s time was probably emergent and loosely organized. Scholars infer from the varied viewpoints of thinkers like Eudoxus, Speusippus, Xenocrates, Aristotle, and others present in the Academy during Plato’s lifetime that Plato encouraged a diversity of perspectives and discussion of alternative views, and that being a participant in the Academy did not require anything like adherence to Platonic orthodoxy. In this way, Plato reflected Socrates’ willingness to discuss and debate ideas rather than the sophists’ claim to teach students mastery of a particular subject matter.  To get a sense of the topics discussed in the Academy, our primary sources are the Platonic dialogues and our knowledge of the persons present at the Academy.

While it is tempting to talk of teachers and students at the Academy, this language can lead to difficulties. While Plato was clearly the heart of the Academy, it is not clear how, if at all, formal status was accorded to members of the Academy. The Greek terms mathētēs (student, learner, or disciple), sunēthēs (associate or intimate), hetairos (companion), and philos (friend), as well as other terms, seem to have been variously used to describe the persons who attended the Academy (Baltes 1993: 10-11; Saunders 1986: 201).

While the precise function of the Platonic dialogues within the Academy cannot be settled, it is practically certain that they were studied and perhaps read aloud by the Academics in Plato’s time. It is also likely that the dialogues were circulated as a way to attract possible students (Themistius, Orations 23.295). As a cursory survey, dialogues like the Republic, Timaeus, and Theaetetus show Plato’s interest in mathematical speculation; the Republic, Statesman, and the Laws attest to Plato’s interest in political theory; the Cratylus, Gorgias, and Sophist show an interest in language, logic, and sophistry, and many dialogues, including the Parmenides, Sophist, and Republic show an interest in metaphysics and ontology. While Plato’s interests were varied and interconnected, the topics of the dialogues reflect topics that Academics were likely to be engaged with.

The array of topics examined in Plato’s dialogues does parallel some of what we know about the philosophical interests of the individuals at the Academy in Plato’s lifetime. Theaetetus of Athens and Eudoxus of Cnidus were mathematicians, and Philip of Opus was interested in astronomy and mathematics in addition to serving as Plato’s secretary and editor of the Laws. Aristotle, a wealthy citizen of Stagira, came to the Academy in 367 as a young man and stayed until Plato’s death in 347. Aristotle’s twenty-year-long participation in the Platonic Academy shows Plato’s openness in encouraging and supporting philosophers who criticized his views, attests to the Academy’s growing reputation and ability to attract students and researchers, and sheds some light on the organization of the Academy. Aristotle reportedly taught rhetoric at the Academy, and it is certain that he researched rhetorical and sophistical techniques there. It is very probable that Aristotle began writing many of the works of his that we possess today at the Academy (Klein 1985: 173), including possibly parts of the biological works, even though biological research based on empirical data is not a line of inquiry that Plato pursued himself. Aristotle’s multiple references to Platonic dialogues in his own works also suggest how the Platonic dialogues were used by students and researchers at the Academy. While most of the pupils at the Platonic Academy were male, Diogenes Laertius lists two female students, Lastheneia of Mantinea and Axiothea of Phlius, in his list of Plato’s students (Lives III.46-47).

While the Platonic Academy was a community of philosophers gathered to engage in research and discussion around a wide array of topics and questions, the Academy, or at least the individuals gathered there, had a political dimension. Plutarch’s Reply to Colotes claims that Plato’s companions from the Academy were involved in a wide variety of political activities, including revolution, legislation, and political consulting (1126c-d). The various Epistles ascribed to Plato support this view by attesting to Plato’s involvement in the politics of Syracuse, Atarneus, and Assos. While claims that the Academy was an “Organized School of Political Science” or the “RAND Corporation” of antiquity go too far in ascribing formal structure and organization to the Academy, Plato and the individuals associated with the Academy were involved in the political issues of their time as well as purely theoretical discussions about political philosophy.

As noted above, some of the discussions Plato held were on the public grounds of the Academy, while other discussions were held at his private residence. Aristoxenus records at least one poorly received public lecture by Plato on “the good” (Elements of Harmonics II.30), and a comic fragment from Epicrates records Plato, Speusippus, Menedemus, and several youths engaging in dialectical definition of a pumpkin (Athenaeus, Sophists at Dinner II.59). While it is difficult to reconstruct how instruction occurred at the Academy, it seems that dialectical conversation, lecture, research, writing, and the reading of the Platonic dialogues were all used by individuals at the Academy as methods of philosophical inquiry and instruction.

Although the establishment of the Academy is an important part of Plato’s legacy, Plato himself is silent about his Academy in all of the dialogues and letters ascribed to him. The word “Academy” occurs only twice in the Platonic corpus, and in both cases it refers to the gymnasium rather than any educational organization. One occurrence, already mentioned, is from the Lysis, and it describes Socrates walking from the Academy to the Lyceum (203a). The other occurrence, in the spurious Axiochus, refers to ephebic and gymnastic training (367a) on the grounds of the Academy and does not refer to anything that has to do with Plato’s Academy.

Plato’s silence about the Academy adds to the difficulty of labeling his Academy with the English word “school.” Diogenes Laertius refers to Plato’s Academy as a “hairesis,” which can be translated as “school” or “sect”  (Lives III.41). The noun “hairesis” comes from the verb “to choose,” and it thereby signifies “a choice of life” as much as “a place of instruction.” The head of the Academy after Plato was called the “scholarch,” but while scholē forms the root of our word “school” and was used to refer to Plato’s Academy (Lives IV.2), it originally had the meaning of “leisure.” The Greek word diatribē can also be translated as “school” from its connotation of spending time together, but no matter what Greek term is used, the activities occurring at the Academy during Plato’s lifetime do not neatly map on to any of our concepts of school, university, or college. Perhaps the clearest term to describe Plato’s Academy comes from Aristophanes’ Clouds, written at least three decades before the Academy was established: phrontistērion (94). This term can be translated as “think tank,” a term that may be as good as any other to conceptualize the Academy’s multiple and evolving activities during Plato’s lifetime.

4. The Academy after Plato

In 347 B.C.E. Plato died at approximately eighty years of age. According to Diogenes Laertius, Plato was buried in the Academy (Lives III.41). Unlike the claim that Plato purchased property in the sacred precinct of the Academy, this assertion is possible, for the grounds of the Academy were used for burial, shrines, and memorials. At any rate, Pausanias records that in his own time there was a memorial to Plato not far from the Academy (Attica XXX.3).

Although the entrenchment of the words “academy” and “academic” in contemporary discourse makes the persistence of the Platonic Academy seem inevitable, this is probably not how it appeared to Plato or to members of the Academy after his death (Watts 2007: 122). Rather, the Academy continued to develop its sense of identity and plans for persistence after Plato’s death.

One way to develop a partial picture of the Academy after Plato’s death is to review the succession of Academic scholarchs. The chronological succession of scholarchs after Plato, according to Diogenes Laertius, is as follows:

  • Speusippus of Athens, Plato’s nephew, was elected scholarch after Plato’s death, and he held that position until 339 B.C.E.
  • Xenocrates of Chalcedon was scholarch until 314 B.C.E.
  • Polemo of Athens was scholarch of the Academy until 276 B.C.E.
  • Crates of Athens, a pupil of Polemo, was the next scholarch.
  • Arcesilaus of Pitane was scholarch until approximately 241 B.C.E.
  • Lacydes of Cyrene was scholarch until approximately 216 B.C.E.
  • Telecles and Evander, both of Phocaea, succeeded Lacydes as dual scholarchs.
  • Hegesinus of Pergamon succeeded the dual scholarchs from Phocaea.
  • Carneades of Cyrene succeeded Hegesinus.
  • Clitomachus of Carthage succeeded Carneades in 129 B.C.E.

While Clitomachus is the last scholarch listed by Diogenes Laertius, Cicero provides us with information about Philo of Larissa, with whom he himself studied (De Natura Deorum I.6,17). Philo was a pupil of Clitomachus and was a head of the Academy (Academica II.17; Sextus Empiricus, Outlines of Pyrrhonism I.220). Antiochus of Ascalon, who also taught Cicero, is sometimes considered a head of the Academy (Sextus Empiricus, Outlines of Pyrrhonism I.220-221), but his philosophical position (I.235) and the fact that his school did not meet on the grounds of the Academy (Cicero, De Finibus V.1) make Antiochus’s school discontinuous with the Platonic Academy.

The terms “Old Academy,” “Middle Academy,” and “New Academy” are used in somewhat different ways by Cicero, Sextus Empiricus, and Diogenes Laertius to describe the changing viewpoints of the Platonic Academy from Speusippus to Philo of Larissa. What seems clear from the various accounts is that, with Arcesilaus, a skeptical edge entered into Academic thinking that persisted through Carneades and Philo of Larissa.

The Mithridatic War of 88 B.C.E. and Sulla’s destruction of the grounds of the Academy and Lyceum as part of the siege of Athens in 86 B.C.E. (Plutarch, Sulla XII.3) mark the rupture between the geographical precinct of the Academy and the lineage of philosophical instruction stemming from Plato that together constitute the Platonic Academy. The destruction of the gymnasium at the Lyceum also marks the end of Aristotle’s peripatetic school (Lynch 1972: 207).

While the Platonic Academy can be said to end with the siege led by Sulla, philosophers including Cicero, Plutarch of Chaeronea, and Proclus continued to identify themselves as Platonists or Academics. In 176 C.E., the Roman Emperor and Stoic philosopher Marcus Aurelius helped continue the influence of Platonic and Academic thought by establishing Imperial Chairs for the teaching of Platonism, Stoicism, Aristotelianism, and Epicureanism, but the holders of these chairs were not associated with the long-abandoned schools that once met on the grounds of the Lyceum or the Academy.

Sometime in the fourth century C.E., a Platonic school was reestablished in Athens by Plutarch of Athens, though this school did not meet on the grounds of the Academy. After Plutarch, the scholarchs of this Platonic school were Syrianus, Proclus, Marinus, Isidore, and Damascius, the last scholarch of this Academy. In 529 C.E. the Christian Roman Emperor Justinian forbade Pagans from publicly teaching, which, along with the Slavonic invasions of 580 C.E. (Lynch 1972: 167), marks an end of the flourishing of Neo-Platonism in Athens.

The Platonic Academy forms an important part of Plato’s intellectual legacy, and analyzing it can help us better understand Plato’s educational, political, and philosophical concerns. While studying the Academy sheds light on Plato’s thought, its history is also invaluable for studying the reception of Plato’s thought and for gaining insight into one of the crucial sources of today’s academic institutions. Indeed, the continued use of the words “academy” and “academic” to describe educational organizations and scholars through the twenty-first century shows the impact of Plato’s Academy on subsequent education.

Today, the area that contains the sacred precinct and gymnasium that housed Plato’s Academy lies within a neighborhood known as Akadimia Platonos. The ruins of the Academy are accessible by foot, and a small museum, Plato’s Academy Museum, helps to orient visitors to the site.

5. References and Further Reading

a. Primary Sources

  • Aelian, (Claudius Aelianus) (2nd-3rd cn. C.E.). Historical Miscellany. Trans. Nigel G. Wilson. Cambridge, MA: Loeb Classical Library, 1997.
    • Chapter XIX of Book 3 of Aelian’s Historical Miscellany is titled “Of the dissention between Aristotle and Plato.” This chapter records a conflict between Plato and Aristotle that has been used to infer that Plato had a private home where he taught in addition to leading conversations on the grounds of the Academy.
  • Aristophanes (c.448-380 B.C.E.). Clouds. Trans. Alan Sommerstein. Warminster: Aris and Phillips, 1991.
    • While written too early to shed light on Plato, this text is crucial for understanding Athenian education, the sophists, and Socrates. It also contains the passage cited above that describes the grounds of the Academy in the 420s.
  • Aristotle (384-322 B.C.E.).
    • The writings of Aristotle are a valuable resource for learning more about the philosophies of some of the individuals that were part of the early Academy. See for example the references to Speusippus in Metaphysics Zeta, Chapter 2, Lambda, Chapter 7, and Mu, Chapter 7; see also the references to Eudoxus in Metaphysics Alpha, Chapter 8, Lambda, Chapter 8, and Nicomachean Ethics, Book 10, Chapter 2.
  • Aristoxenus of Tarentum (c.370-300 B.C.E.). The Harmonics of Aristoxenus. Ed. and trans. Henry S. Macran. Oxford: Clarendon Press, 1902.
    • Aristoxenus was a student of Aristotle’s and he is an early source for Plato’s public lecture “On the Good.”
  • Athenaeus of Naucratis (2nd-3rd cn. C.E.). The Deipnosophists. In Seven Volumes. Trans. Charles Burton Gulick. Cambridge, MA: Loeb Classical Library, 1951.
    • This lengthy work is a source of much information about antiquity. Scholars of the Academy are particularly drawn to the fragment from Epicrates preserved by Athenaeus that gives a comic presentation of Platonic dialectic.
  • Cicero, Marcus Tullius (106-43 B.C.E.).
    • Cicero’s many writings, including Academica, De Natura Deorum, De Finibus, and Tusculan Disputations, contain information about the Academy.
  • Diogenes Laertius (2nd-3rd cn. C.E.). Lives and Opinions of Eminent Philosophers. Two Volumes. Trans. R. D. Hicks. Cambridge, MA: Loeb Classical Library, 1925.
    • Diogenes is an invaluable resource for the lives of ancient philosophers, although he is writing five hundred or so years after the philosophers he describes.
  • Pausanias. (2nd cn. C.E.). Description of Greece. Four Volumes. Trans. W. H. S. Jones. Cambridge, MA: Loeb Classical Library, 1959.
    • Book I of Pausanias’ work deals with Attica; Chapters XXI-XXX shed light on the history of the Academy and how it appeared to Pausanias several centuries later.
  • Philodemus. (c.110-c.30 B.C.E.). Index Academicorum.
    • Philodemus was an Epicurean philosopher who wrote a work on the Platonic Academy. Some fragments of this work have been discovered. For more information, see Blank (2019), below.
  • Plato. Complete Works. Ed. John Cooper. Indianapolis: Hackett, 1997.
    • While the dialogues and letters of Plato do not mention the Platonic Academy, they are an important resource in understanding Plato’s educational and political commitments and activities as well as the educational environment of Athens in the last few decades of the fifth century.
  • Plutarch of Chaeronea (c.45-125 C.E.). Parallel Lives and Moralia.
    • Plutarch’s works are collected in the Loeb Classical Library under Lives (Eleven Volumes) and Moralia (Fifteen Volumes). Particularly valuable for the student of the Academy are Reply to Colotes and Life of Dion, but many of the works found in Plutarch’s corpus shed light on Plato, the Academy, and Platonism.
  • Proclus (412-485 C.E.). A Commentary on the First Book of Euclid’s Elements. Trans. Glenn R. Morrow. Princeton: Princeton University Press, 1970.
    • Book 2, Chapter IV of Proclus’s commentary gives an account of the development of mathematics that includes helpful information about Plato and other members of the Academy. The “Foreword to the 1992 Edition” of Morrow’s translation by Ian Mueller is also helpful to students of Plato’s Academy.
  • Sextus Empiricus (2nd-3rd cn. C.E.). Outlines of Pyrrhonism. Four Volumes. Trans. R. G. Bury. Cambridge, MA: Loeb Classical Library, 1955.
    • As part of his presentation of skepticism, Sextus articulates how skepticism and Academic philosophy differ in Book I, Chapter XXXIII.
  • Suda.
    • The Suda is a tenth-century C.E. Byzantine Greek encyclopedia. The entries on “To Hipparchou teichion,” “Akademia,” and “Platon” were helpful for this article. An online version of the Suda can be accessed at http://www.stoa.org/sol/
  • Themistius (c.317-388 C.E.). The Private Orations of Themistius. Trans. Robert J. Penella. Berkeley: University of California Press, 2000.
    • Themistius was a philosopher and senator in the fourth century C.E. who taught in Constantinople. In his 23rd Oration, “The Sophist,” he relays that a Corinthian farmer became Plato’s student after he read the Gorgias; Axiothea had a similar experience reading the Republic, and Zeno of Citium came to Athens after reading the Apology of Socrates.
  • Thucydides (c.5th cn. B.C.E.). The Peloponnesian War. Ed. Robert B. Strassler. Trans. Richard Crawley. New York: Touchstone, 1998.
    • While Thucydides’ work does not shed light on the Academy, he does describe its environs and other aspects of Athenian history that are important for understanding Plato.

b. Secondary Sources

  • Athanassiadi, Polymnia. Damascius. The Philosophical History. Athens: Apamea Cultural Association, 1999.
  • Baltes, Matthias. “Plato’s School, the Academy,” Hermathena, No. 155 (Winter 1993): 5-26.
    • A very clear and well documented portrait of Plato’s Academy.
  • Blank, David, “Philodemus,” The Stanford Encyclopedia of Philosophy (Spring 2019 Edition), Edward N. Zalta (ed.), URL = .
  • Brunt, P. A. “Plato’s Academy and Politics” in Studies in Greek History and Thought. Oxford: Oxford University Press, 1993.
  • Cherniss, Harold. The Riddle of the Early Academy. Berkeley: University of California Press, 1945.
  • Chroust, Anton-Hermann. “Plato’s Academy: The First Organized School of Political Science in Antiquity,” The Review of Politics, Vol. 29, No. 1 (Jan., 1967): 25-40.
  • Dancy, R. M. Two Studies in the Early Academy. Albany: State University of New York Press, 1991.
  • Dillon, John. The Heirs of Plato: A Study of the Old Academy (347-274 BC). Oxford: Clarendon Press, 2003.
    • A study of the Academy with special attention to the philosophies of Plato’s successors.
  • Dillon, John. The Middle Platonists: 80 B.C. to A.D. 220. Ithaca: Cornell University Press, 1996.
  • Glucker, John. Antiochus and the Late Academy. Göttingen: Hypomnemata 56, 1978.
  • Hadot, Pierre. What is Ancient Philosophy? Trans. Michael Chase. Cambridge, MA: Harvard University Press, 2002.
  • Hornblower, Simon and Anthony Spawforth. The Oxford Classical Dictionary. 3rd ed. Oxford: Oxford University Press, 2003.
  • Klein, Jacob. Lectures and Essays. Annapolis: St. John’s College Press, 1985.
  • Lynch, John Patrick. Aristotle’s School: A Study of a Greek Educational Institution. Berkeley: University of California Press, 1972.
    • This work is essential to anyone investigating classical educational institutions.
  • Mintz, Avi. Plato: Images, Aims, and Practices of Education. Cham, Switzerland: Springer, 2018.
  • Nails, Debra. Agora, Academy, and the Conduct of Philosophy. Dordrecht: Kluwer Academic Publishers, 1995.
  • Nails, Debra. The People of Plato: A Prosopography of Plato and Other Socratics. Indianapolis: Hackett Publishing, 2002.
    • This work provides historical context for all of the individuals mentioned in the Platonic dialogues.
  • Nails, Debra. “The Life of Plato of Athens” in A Companion to Plato, edited by Hugh Benson. Malden, MA: Wiley-Blackwell Publishing, 2009.
  • Natali, Carlo. Aristotle: His Life and School. Edited by D. S. Hutchinson. Princeton: Princeton University Press, 2013.
  • Press, Gerald A., ed. The Bloomsbury Companion to Plato. London: Bloomsbury Academic, 2015.
    • A very valuable reference work on Plato. Chapter 1, “Plato’s Life—Historical and Intellectual Context” and Chapter 5, “Later Reception, Interpretation and Influence of Plato and the Dialogues” are particularly valuable for those interested in the history of the Academy.
  • Preus, Anthony. Historical Dictionary of Ancient Greek Philosophy. 2nd edition. Lanham: Rowman & Littlefield Publishers, 2015.
    • This clear and reliable historical dictionary is useful for students of ancient Greek philosophy.
  • Rihill, T. E. “Teaching and Learning in Classical Athens,” Greece & Rome, Vol. 50, No. 2 (Oct., 2003): 168-190.
  • Saunders, Trevor J. “‘The Rand Corporation of Antiquity’? Plato’s Academy and Greek Politics” in Studies in Honor of T. B. L. Webster, vol. I, eds. J. H. Betts et al. Bristol: Bristol Classical Press, 1986.
  • Thesleff, Holger. Platonic Patterns: A Collection of Studies. Las Vegas: Parmenides Publishing, 2009.
  • Wareh, Tarik. The Theory and Practice of Life: Isocrates and the Philosophers. Cambridge, MA: Center for Hellenic Studies, 2012.
  • Watts, Edward. “Creating the Academy: Historical Discourse and the Shape of Community in the Old Academy,” The Journal of Hellenic Studies, Vol. 127 (2007): 106-122.
    • This article argues that the Old Academy developed in an unplanned fashion and that the Old Academy attempted to craft its identity based on life-style and character as much as doctrine.

Author Information

Lewis Trelawny-Cassity
Email: lcassity@antiochcollege.edu
Antioch College
U. S. A.

James Frederick Ferrier (1808—1864)

James Frederick Ferrier was a mid-nineteenth-century Scottish metaphysician who developed the first post-Hegelian system of idealism in Britain. Unlike the British Idealists in the latter half of the nineteenth century, he was neither a Kantian nor a Hegelian. Instead, he largely develops his idealist metaphysics via his defense of Berkeley and through his rejection of Thomas Reid’s philosophy of common sense. In this way, he is a transitional figure between the philosophy of Enlightenment Scotland and the development of British Idealism in the latter half of the nineteenth century. Ferrier was also the first philosopher in English to refer to the philosophy of knowledge as Epistemology.

The most fully realized version of his metaphysics appears in his Institutes of Metaphysic. For Ferrier, epistemology is primary and must be the starting point for philosophy. His metaphysics depends on the axiom that the minimum unit of cognition involves a synthesis of subject-with-object, which is the absolute in cognition. From here he develops an idealist ontology, which concludes that what really exists is the absolute: some self in union with some object. The central features of his philosophy include the importance of self-consciousness, a rejection of noumena or things-in-themselves, and his theory of ignorance.

Table of Contents

  1. Life and Works
  2. Thought and Writings
    1. Self-consciousness
    2. Reappraisal of Berkeley
    3. Critique of Reid
    4. Idealist Metaphysics
  3. Reception and Influence
  4. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life and Works

Ferrier was born in Edinburgh, Scotland, in 1808. His father, John Ferrier, was a lawyer known as a Writer to the Signet, and his mother was Margaret Wilson. His family was well connected; his uncle, John Wilson (also known as “Christopher North”), was an author and the Professor of Moral Philosophy at Edinburgh University, and his aunt was the novelist Susan Ferrier. Notable figures such as Sir Walter Scott, James Hogg, William Wordsworth, and Thomas De Quincey were acquainted with Ferrier and his family. He began his education in Ruthwell, Dumfriesshire, where he lived with the family of a Rev. Dr. Duncan. He then went to Edinburgh High School, followed by a period at another school in Greenwich. At the age of seventeen, he attended Edinburgh University for two academic sessions from 1825 to 1827. Then, in 1828, he moved to Oxford to study at Magdalen College for his B.A., which he received in 1831. His student life was unexceptional, and he did not show a particular aptitude for philosophy until later in his life.

He returned to Edinburgh after graduation and began a short-lived career in law. It was at this time that he developed his interest in philosophy. In the early 1830s he became friends with the philosopher Sir William Hamilton, and they remained in close contact until Hamilton’s death in 1856. Indicative of his growing interest in German thought, Ferrier traveled to Germany in 1834 where he spent several months in Heidelberg; his awareness of the German Idealists is apparent from the fact that he returned to Scotland with a photograph and a medallion of Hegel. In 1837 he married his cousin Margaret Wilson who was the daughter of his famous uncle “Christopher North.” By all accounts, they had a happy marriage and went on to have five children.

In the late 1830s, Ferrier started to publish articles in philosophy, and this led to his subsequent academic career. In 1842 he gained his first academic chair, becoming the Professor of Civil History at Edinburgh. In 1844-1845 he acted as Hamilton’s substitute in the Chair of Logic and Metaphysics at Edinburgh during the older philosopher’s illness. Then, in 1845, Ferrier moved his family to St Andrews where he became the Professor of Moral Philosophy and Political Economy. He unsuccessfully attempted to get two Edinburgh Chairs: Moral Philosophy in 1852 and Logic and Metaphysics in 1856. He was unsuccessful in the first case due to sectarian politics and in the latter instance because his metaphysics were considered to be too far from the Scottish philosophy of his predecessors. For this reason, he remained at St Andrews for the remainder of his career. He died in St Andrews in 1864, and he is buried in St Cuthbert’s Churchyard, which is in the city center of Edinburgh.

Ferrier published several articles on literature and philosophy during his lifetime, and many of these were published in Blackwood’s Magazine. Among his articles, there are a few that are particularly indicative of his philosophical interests and eloquent writing style. These are his seven-part series “An Introduction to a Philosophy of Consciousness” (1838-1839), “Berkeley and Idealism” (1842), and “Reid and the Philosophy of Common Sense” (1847). A selection of his collected works appears in three volumes (originally published by Blackwood and Sons in 1875 and republished by Thoemmes Press in 2001). The first volume contains his most significant work, the Institutes of Metaphysic, which was originally published in 1854; here, Ferrier presents a complete system of metaphysics. The contemporary reaction to this was mixed, and Ferrier believed that certain critics, in an attempt to stifle his self-designated “new Scottish philosophy” in favor of the more traditional, or “old Scottish philosophy,” of his predecessors, deliberately misinterpreted his Institutes. Therefore, he subsequently wrote a scathing defense of the Institutes called Scottish Philosophy: The Old and the New (1856) in which he reiterates his arguments in favor of idealism and attacks his critics. A selection from Scottish Philosophy appears as “Appendix” to “Institutes of Metaphysic” in the first volume of his collected works. The second volume contains his lectures on Greek Philosophy, which he worked on in the later years of his life and which were published posthumously. The final volume consists of a selection of his articles.

2. Thought and Writings

a. Self-consciousness

A topic that Ferrier concentrates on throughout his philosophical works is self-consciousness, which he generally refers to as “consciousness.” It is: “that notion of self, and that self-reference, which in man generally, though by no means invariably, accompanies his sensations, passions, emotions, play of reason, or states of mind whatsoever” (Ferrier 2001: vol. 3. 40). His focus on self-consciousness is central to his rejection of the Enlightenment goal to develop a “science of human nature.” Further, it forms the basis of his idealism.

He places utmost importance on self-consciousness because he believes that it is the peculiar and defining characteristic of humanity. He contends that things such as sensation and the capacity for reason are not only shared with other animals but are also given by nature; the human being who is subject to them is akin to “a spoke in an unresting wheel. Nothing connected with him is really his. His actions are not his own” (Ferrier 2001: vol. 3. 36). By contrast, consciousness is the act of will through which a thing becomes a person. One is not born conscious; consciousness must be asserted: “The notion of self … is absolutely genetic or creative. Thinking oneself ‘I’ makes oneself ‘I,’ and it is only by thinking himself ‘I’ that a man can make himself ‘I’; or, in other words, change an unconscious thing into that which is now a conscious self” (Ferrier 2001: vol. 3. 109). Prior to consciousness there is no self or personality; without it the human being is a creature of nature that lives for others. Yet, post-consciousness a person’s acts are her own. It follows that consciousness is the precondition for everything that involves a self. In this way, consciousness is required for freedom, responsibility, morality, religion, and conscience.

Moreover, Ferrier explains in “An Introduction to a Philosophy of Consciousness” that a person’s knowledge of the external world depends on an act of negation in which she distinguishes between the self and the not-self. Thus, one becomes aware of the not-self in conjunction with the self. He describes this principle of idealism as “the fundamental act of humanity” (Ferrier 2001: vol. 3. 177). The concomitance of self and other forms the basis of his metaphysics, and it is a topic that he returns to throughout his published works.

In “An Introduction to a Philosophy of Consciousness” he sets out his concerns with contemporary philosophy and calls for a change of focus. His primary target is the Enlightenment goal to develop a “science of human nature.” In his view, this project is impossible because humanity is essentially different from anything else in the world that can be studied. For instance, in astronomy there is a distinction between the subject and the object; the scientist (the subject) is removed from the celestial objects (the objects) that she studies. Yet, in a “science of human nature” the philosopher is at once both the subject and the object. Now, given that self-consciousness is the defining feature of humanity and thereby central to any account of humanity, a problem arises. If the mind is an object of research, the object is deprived of its characteristic feature, namely self-consciousness, which remains with the subject of the research, leaving nothing but “a wretched association machine” (Ferrier 2001: vol. 3. 195). But, if the mind is considered with self-consciousness, then it cannot be properly considered an object of research because the objectivity is lost in so far as the subject and the object are identical. This leads Ferrier to suggest a change of focus for philosophy; instead of the empirical endeavor of a “science of human nature,” he prefers a more metaphysical approach, which is the development of a “philosophy of consciousness.”

In suggesting a “philosophy of consciousness,” Ferrier conceives philosophy as an extension of what people already do. Philosophy and self-consciousness are different only in degree and not in kind. Philosophy is a systematic and elevated self-consciousness, whereas self-consciousness is unsystematic and informal philosophy. He describes it as follows: “Consciousness is philosophy nascent; philosophy is consciousness in full bloom and blow … thus all conscious men are to a certain extent philosophers, although they may not know it” (Ferrier 2001: vol. 3. 197).

b. Reappraisal of Berkeley

Later in the nineteenth century, the British Idealists such as T. H. Green, F. H. Bradley, and Edward Caird were influenced by Kant and the German Idealists. Ferrier was aware of the German philosophers, but his own idealism does not appear to be directly influenced by them. Nonetheless, he was the first Scottish philosopher to seriously consider them. Thomas De Quincey said that: “he was introduced, as if suddenly stepping into an inheritance, to a German Philosophy refracted through an alien Scottish medium” (The Testimonials of J.F. Ferrier 1852, p.22). His friend and mentor, Hamilton, attempted to synthesize the commonsense philosophy deriving from Reid with the transcendental idealism of Kant. Ferrier separates himself from Kant (and by extension also from Hamilton) by rejecting the existence of noumena or things-in-themselves in the absence of percipient beings. He considers the German Idealists in a more favorable light, and he wrote biographical entries on both Schelling and Hegel for the Imperial Dictionary of Universal Biography (see Ferrier 2001: vol. 3. 545-568). He also makes the occasional reference to Fichte, Schelling, and Hegel in his published works; in general, he views them positively, while depicting Hegel as an opaque genius. For instance, he says:

whatever truth there may be in Hegel, it is certain that his meaning cannot be wrung from him by any amount of mere reading, any more than the whisky which is in bread … can be extracted by squeezing a loaf into a tumbler. He requires to be distilled, as all philosophers do, more or less—but Hegel to an extent which is unparalleled. A much less intellectual effort would be required to find out the truth for oneself than to understand his exposition of it. (Ferrier 2001: vol. 1. 96)

Yet, the most important idealist influence for Ferrier was the Irish philosopher Berkeley: “we are disposed to regard [Berkeley] as the greatest metaphysician of his own country (we do not mean Ireland; but England, Scotland, and Ireland) at the very least” (Ferrier 2001: vol. 3. 458). Indeed, Ferrier, along with his contemporary Alexander Campbell Fraser, can be credited with reviving Berkeley’s philosophy in the nineteenth century. Ferrier refers to Berkeley on numerous occasions throughout his published works, and in “Berkeley and Idealism” he provides an argument for idealism that is developed out of his reaction to Berkeley. First, he defends Berkeley from the accusation that he denies the existence of the external world. Second, he expands on an idealist conception of non-existence, which is something that he believes that Berkeley has overlooked.

Berkeley shared Locke’s belief that ideas are the immediate objects of the mind. However, he rejected Locke’s view that ideas represent real things, and that real things are the indirect objects of the mind. Berkeley argued that ideas are the real things and that there is nothing beyond them. Thus, for Berkeley, the mind directly knows reality. His conclusion that ideas are real things led many to conclude that Berkeley denied the existence of material objects (for instance, see Leibniz, Samuel Johnson, and Reid). Yet, Ferrier strongly rejects the widespread belief that Berkeley denies the existence of matter. He argues that Berkeley readily accepts the existence of matter in the ordinary understanding of such; the external world consists of solid extended bodies that are perceived by the senses. However, he allows that Berkeley denies the existence of the world in itself, a world beyond perceivers. Ferrier emphasizes that what Berkeley wants to show is that reality is as it appears to perceivers; it is the immediate object of perceptions. He denies the existence of intermediate entities between the perceiver and reality and instead argues that that which is perceived is that which exists. In connection with this, Ferrier supports another aspect of Berkeley’s epistemology, specifically, his contention that primary and secondary qualities are akin in so far as each depends on perceivers and provides information about reality. Neither primary nor secondary qualities denote anything more objective about reality; reality is that which is perceived and both primary and secondary qualities are perceived.

Berkeley considered his own philosophy to be in line with common sense and Ferrier agrees. According to Ferrier, it is Berkeley rather than Reid who is “the champion of common sense” (Ferrier 2001: vol. 3. 301). Berkeley’s idealism places the mind in direct contact with reality; there are no intermediate entities. And this, Ferrier suggests, is in line with the experience of ordinary people who do not distinguish between the perceptions of objects and the objects themselves. It is the notion of things-in-themselves, or of a world that exists independently of perceivers, that is at odds with common sense. Berkeley’s idealism, by contrast, is in accordance with common sense.

On the one hand, Ferrier describes Berkeley as “the champion of common sense.” On the other hand, he says that the significance of Berkeley’s philosophy is that he provides the basis for absolute idealism. He says:

[Berkeley] was the first to stamp the indelible impress of his powerful understanding on those principles of our nature, which, since his time, have brightened into imperishable truths in the light of genuine speculation. His genius was the first to swell the current of that mighty stream of tendency towards which all modern meditation flows, the great gulf-stream of Absolute Idealism. (Ferrier 2001: vol. 3. 293)

For Ferrier, common sense and absolute idealism are complementary. According to Ferrier, when “genuine idealism” is “instructed by the unadulterated dictates of common sense” it is indistinguishable from “genuine unperverted realism” (Ferrier 2001: vol. 3. 309).

His admiration for Berkeley is clear and he says: “Among all philosophers, ancient or modern, we are acquainted with none who presents fewer vulnerable points than Bishop Berkeley” (Ferrier 2001: vol. 3. 291). Nevertheless, he acknowledges that there is a weakness in Berkeley’s philosophy, namely, his failure to address non-existence. A common objection levelled against idealism is that it implies that things flit in and out of existence; for example, the tree exists only in so far as it is perceived, and when it is not perceived, it cannot exist. Ferrier recognizes that Berkeley’s account seems to suggest that the world exists only in so far as it is perceived. He believes that this makes Berkeley vulnerable to accusations of subjective idealism. To overcome this, Ferrier broadens Berkeley’s account to include non-existence.

There are two parts to his discussion of non-existence. First, he reiterates the Berkeleian argument that mind-independent objects cannot exist because it is impossible to conceive of them. He says that if a philosopher speaks of the world-as-it-is-in-itself (for instance, the world existing prior to and following the existence of percipient beings), they are obliged to posit an ideal percipient. For example, in order to think of the River Nile existing in a world where there are no percipient beings, one must think about it in terms of its perceivable qualities: size, color, boundaries and so forth. But, in thinking of such things, one is still thinking of the act of perception and not the thing-in-itself. Here, Ferrier returns to “the fundamental act of humanity.” He emphasizes that that which is perceived is inseparable from the act of perception; it is impossible to consider what is seen in isolation from the act of seeing, what is heard in isolation from the act of hearing, and so on.

Second, Ferrier asserts that this argument must be extended to include non-existence as well. Not only is the existence of the world inconceivable without a real or ideal perceiver, but also non-existence similarly requires such a perceiver. In order to conceive nothing, that is silence, colorlessness, tastelessness, and so forth, the philosopher must refer to her perceptual framework. He develops Berkeley’s view that existence is percipi by insisting that non-existence is also percipi. Using Kantian language, he argues that “no phenomena, not even … the phenomenon of the absence of phenomena, are thus independent or irrespective” (Ferrier 2001: vol. 3. 315). Ferrier contends that it is not only matter that depends upon perceivers but also the non-existence of matter. He says:

[U]niversal colourlessness, universal silence, universal impalpability, universal tastelessness, and so forth, are just as much phenomena requiring, in thought, the presence of an ideal percipient endowed with sight and hearing and taste and touch, as their more positive opposites were phenomena requiring such a percipient. (Ferrier 2001: vol. 3. 311)

In this way, non-existence is just as much a known concept as existence. In order to conceive of either the existence or the non-existence of the world, a percipient being, whether real or ideal, is required. By supplementing Berkeley’s theory in this manner, he believes it becomes invulnerable to accusations of subjective idealism; one cannot say that the world will cease to exist in the absence of percipient beings because percipient beings are required to conceive of the world ceasing to exist.

c. Critique of Reid

Although he died more than a decade before Ferrier was born, Thomas Reid’s influence on Scottish philosophy remained strong during Ferrier’s youth and career. Hamilton is famous for his annotated edition of Reid’s works, and while Ferrier professes admiration for Hamilton’s scholarship, he wholeheartedly rejects the focus of his intellect. In Ferrier’s view, Reid produced a form of realism that not only failed to overcome the representative theory of perception but also resulted in its own form of representationism. Additionally, for Ferrier, Reid’s commonsense philosophy is inadequate and anti-philosophical. Instead, he calls for a new Scottish philosophy that is more systematic and rational; that is, an idealist metaphysics.

Reid was a Berkeleyan in his youth, but Hume’s skepticism led him to reassess his philosophical assumptions, which, in turn, led him to reject the theory of ideas. A version of the theory of ideas can be found in a range of philosophers from Descartes to Hume. In general, this theory posits that ideas are the immediate objects of one’s mind. This epistemological belief allows for a variety of metaphysical positions, including: Locke’s realism, Berkeley’s idealism, and Hume’s skepticism. Reid recognized that Hume’s astute reasoning was the logical development of the theory of ideas. At the same time, he could not accept Hume’s conclusions that we must be skeptical about things such as the continued existence of objects or the continuation of one’s personal identity. Thus, Reid examined the foundations of this theory: the existence of ideas. He realized that he had no experience of ideas and concluded that they are philosophical constructs, which are at odds with common sense. According to Reid, all persons share a priori commonsense principles upon which all reasoning depends. For instance, the belief in the existence of the external world, the principle of causality, and the belief that one is the same person she was yesterday and will be tomorrow, all count among Reid’s principles of common sense. The aspect of Reid’s theory that is most important for Ferrier is his philosophy of perception. Reid holds that we perceive objects directly and not via intermediate entities such as ideas. In his view, all persons have a commonsense belief in the existence of the external world that is irresistible and prior to reasoning. In this way, Reid was said to remove representationism from the theory of perception; the objects of knowledge are the things themselves rather than representative intermediaries such as ideas. Ferrier, however, argues that Reid failed to disprove representationism and that Reid’s theory of perception retains a form of representationism.

A discussion of the perception of matter is central to Ferrier’s philosophical writings, and it is this issue that he believes demonstrates the central difference between Berkeley and the commonsense school. One of his main talking points is representationism. On this topic, he dismissively says that “Berkeley thus accomplished the very task which, fifty or sixty years afterwards, Reid laboured at in vain” (Ferrier 2001: vol. 1. 490). Ferrier believes that Reid and others have misunderstood Berkeley by mistaking him for a representationist. Yet, Ferrier believes that idealism—both his own and Berkeley’s—is the only type of philosophy that can overcome representationism. He criticizes Reid’s theory of perception throughout his published works, and his argument against him is best expressed in his article “Reid and the Philosophy of Common Sense.” Here, he refutes Reid’s realist account of perception and develops his own idealist theory.

Ferrier divides philosophical accounts of perception into two schools: the metaphysical school and the psychological school. His idealist metaphysics is an example of the former and Reid’s commonsense philosophy is an example of the latter. Both schools accept that the perception of matter occurs, yet they disagree about what this entails. Ferrier considers “the perception of matter” to be a whole, indivisible unit:

In the estimation of metaphysic, the perception of matter is the absolutely elementary in cognition, the ne plus ultra of thought. Reason cannot get beyond, or behind it. It has no pedigree. It admits of no analysis. It is not a relation constituted by the coalescence of an objective and a subjective element. It is not a state or a modification of the human mind. It is not an effect which can be distinguished from its cause. It is not brought about by the presence of antecedent realities. It is positively the FIRST, with no forerunner. The perception-of-matter is one mental word, of which the verbal words are mere syllables. (Ferrier 2001: vol. 3. 410, 411)

On the other hand, there is the psychological school’s approach to the perception of matter, which considers the relation between two component parts: the subjective perception and the objective matter. And, in Ferrier’s view, this approach leads to representationism.

Representationists make a distinction between an immediate and a remote object of the mind. For instance, Locke argues that we know things in the world via our ideas; things are the indirect objects of our minds, whereas ideas are the immediate object of our minds. What Ferrier believes is that Reid and other “psychologists” similarly set up a remote and an immediate object of the mind in their accounts of perception. He argues that the psychological school holds that there is the material world which exists regardless of whether it is perceived or not and that there are percipient beings who know the material world via their perceptions of it. It follows that in this account of the perception of matter there is both an objective aspect (the external world) and a subjective aspect (the subject’s perception of that world). He observes that this creates both an immediate and a remote object of knowledge; the subject knows her perception of the world immediately, whereas she knows the world remotely and only via her perception of it. He says:

When a philosopher divides, or imagines that he divides, the perception of matter into two things, perception and matter; holding the former to be a state of his own mind, and the latter to be no such state; he does, in that analysis, and without saying one other word, avow himself to be a thoroughgoing representationist. For his analysis declares that, in perception, the mind has an immediate or proximate, and a mediate or remote object. Its perception of matter is the proximate object, the object of its consciousness; matter itself, the material existence, is the remote object—the object of its belief. (Ferrier 2001: vol. 3. 415)

Therefore, Ferrier suggests that in attempting to avoid representationism, Reid and others are paradoxically guilty of the very thing that they are trying to dispel. In order to truly avoid representationism, Ferrier insists on an idealist account of perception. Again he returns to “the fundamental act of humanity.” In his view, the “perception of matter” is a composite that cannot be broken down into its constituent parts; subjects and objects are always presented at once and can never be separated.

While Ferrier’s critique of Reid’s analysis of the perception of matter is astute, at other times he makes derogatory remarks about his predecessor in an ad hominem manner. For instance, he says that when Reid is considered alongside philosophers such as Berkeley or Hume, he is akin to a “whale in a field of clover” (Ferrier 2001: vol. 1. 495). Remarks such as these have more to do with the dominance of commonsense philosophy during his lifetime and the ways in which it hampered his own career than with a thoughtful analysis of Reid’s ideas. Yet, despite his dismissal of Reid and the philosophy of common sense, Ferrier nevertheless wants to retain the language of “common sense.” Indeed, he believes that his own idealism is an example of an enlightened system of common sense.

d. Idealist Metaphysics

One of Ferrier’s criticisms of the philosophy of common sense is that it formalizes the inadequacies of ordinary thinking.

Common sense … is the problem of philosophy, and is plainly not to be solved by being set aside, but just as little is it to be solved by being taken for granted, or in other words, by being allowed to remain in the primary forms in which it is presented to our notice. (Ferrier 2001: vol. 3. 64)

By contrast, he thinks that philosophy should fulfill a corrective purpose; he says: “philosophy exists only to correct the inadvertencies of man’s ordinary thinking” (Ferrier 2001: vol. 1. 32). A rational consideration of the laws of thought is required to separate unrefined opinions from the “genuine principles of common sense.” This is exactly what he tries to achieve in his major work the Institutes of Metaphysic; here, he attempts to systematically reveal the laws of thought via reason.

The Institutes is arranged into three main books, which follow on from one another: the Epistemology, the Agnoiology or theory of ignorance, and finally the Ontology. Together, they comprise his idealist metaphysics. Unusually, for a philosophical work, the Institutes is written in a deductive style. Ferrier’s metaphysics are deduced from an axiomatic, self-evident principle. In the introduction to his Institutes he asserts that: “From this single proposition the whole system is deduced in a series of demonstrations, each of which professes to be as strict as any demonstration in Euclid, while the whole of them taken together constitute one great demonstration” (Ferrier 2001: vol. 1. 30). His “Epistemology” consists of twenty-two propositions, the “Agnoiology” has eight propositions, and he concludes with the eleven propositions that form his “Ontology.” Each proposition involves a demonstration and a subsequent discussion in which he posits a counter-proposition that he disproves.

While Ferrier’s own philosophy is largely unknown to contemporary epistemologists, it is noteworthy that he was the first philosopher in English to call the philosophy of knowledge “epistemology.” His own epistemology is central to his philosophy as is evident from the fact that it forms the largest part of his metaphysics. It is also the common focus that appears in all of his published works. In his 1841 article “The Crisis of Modern Speculation,” he says: “Before we can be entitled to speak of what is, we must ascertain what we can think” (Ferrier 2001: vol. 3. 272). And this is a principle that he follows in the Institutes by grounding his metaphysics in his epistemology. For Ferrier, it is important to secure the laws of thought before making any positive statements about reality. Thus, “Proposition I” or “the primary law or condition of all knowledge” is the axiom from which the rest of Ferrier’s system follows. It asserts that: “Along with whatever any intelligence knows, it must, as the ground or condition of knowledge, have some cognisance of itself” (Ferrier 2001: vol. 1. 79).

The first proposition asserts that self-consciousness is the necessary concomitant of all knowledge; in knowing anything (for example, “that Tuesday follows Monday,” or “that one is reading Ferrier’s metaphysics”), at the same time, a person knows herself. In this way, Ferrier’s Institutes are the natural development of his work on consciousness; self-consciousness, as the peculiar feature of humanity, shapes his entire metaphysics. From this starting point, the main deductive conclusion that follows is that the minimum unit of cognition requires some self in union with some object. This forms Ferrier’s conception of the absolute; for Ferrier, a synthesis of subject-with-object is the absolute in knowledge.

If that which can be known must be a synthesis of subject-with-object, then this is a union that cannot be broken down into its constituent parts. As such, there can be no mere objects or matter per se. He says:

Everything which I, or any intelligence, can apprehend, is steeped primordially in me … Whether the object be what we call a thing or what we call a thought, it is equally impossible for any effort of thinking to grasp it as an intelligible thing or as an intelligible thought, when placed out of all connection with the ego. This is a necessary truth of all reason—an inviolable law of all knowledge. (Ferrier 2001: vol. 1. 120)

Hence, in perception, there can be no objects as they are, independent of knowers (typically known as things-in-themselves or noumena). For Ferrier, things-in-themselves are not objects of knowledge; they are unthinkable and as such they are the contradictory and unknowable by any mind, including by a supreme knower. In rejecting things-in-themselves, he has in mind Reid but also Hamilton and Kant as well as any philosophers who hold that there is a noumenal world. In his idealist epistemology, the notion of a thing-in-itself contradicts the laws of thought; one cannot conceive of a thing-in-itself because the synthesis of subject-with-object is the minimum unit of cognition, which cannot be broken down. Similarly, subjects-in-themselves are unknowable by all minds, including that of a supreme knower. In this way, the ego or self in itself is unknowable. While the self is the constant concomitant of all knowledge, there must also be an object that it is conjoined with. Ferrier calls the self the universal in all knowledge and the object is the particular in all knowledge.

Once he has established what can be known, he wants to reveal what cannot be known. Thus, in his Agnoiology he considers what, if anything, is a possible object of ignorance. This is one of the most distinctive and interesting features of Ferrier’s philosophy because the philosophy of ignorance has been given limited attention in the history of philosophy. His definition of ignorance is: not knowing that which could be known. In his view, ignorance involves a deficit or a privation of knowledge; it is a failure by the knower to know something that could be known. In some cases, this might be a result of one’s limited constitution; for instance, a finite knower has more limited abilities for cognition than a supreme knower and there are some things that a finite knower could never know but are nevertheless the object of knowledge for some knower. In other cases, this might be a failure of will or effort; for instance, one might not know the time of day at a given moment, although that is something that could be rectified. By contrast, there are things that could never be known by any knower, including a supreme knower. This is what Ferrier designates the contradictory. For instance, no one, including a supreme knower, could know that 2 + 2 = 5 because this violates the laws of reason. For Ferrier, not knowing the contradictory is not ignorance but rather evidence of the strength of reason. Thus, “Proposition III” of his “Agnoiology” or “the law of all ignorance” asserts that: “We can only be ignorant of what can possibly be known; in other words, there can be an ignorance only of that of which there can be a knowledge” (Ferrier 2001: vol. 1. 412).

Given that in his “Epistemology” he has already concluded that the object of knowledge must be a synthesis of subject-with-object, the central conclusion of the “Agnoiology” is that that which we are ignorant of is a synthesis of subject-with-object, or in other words, the absolute in cognition. That which is the object of knowledge is some synthesis of subject-with-object. That which is the object of ignorance is some synthesis of subject-with-object. Thus, the possible objects of knowledge and ignorance are one and the same: the absolute in cognition. It follows that matter per se and the ego per se are neither the objects of knowledge nor ignorance. He returns to his contention that his idealism is in line with common sense when he says:

Novel, and somewhat startling, as this doctrine may seem, it will be found, on reflection, to be the only one that is consistent with the dictates of an enlightened common sense … If we are ignorant at all (and who will question our ignorance?) we must be ignorant of something; and this something is not nothing, nor is it the contradictory. (Ferrier 2001: vol. 1. 434)

Once Ferrier has established that the absolute must be the object of knowledge and ignorance, he moves to the question of being and considers what is. His “Ontology” directly follows from his “Epistemology” and the “Agnoiology.” In the opening proposition of this section he sets out the possibilities for that which is, which he refers to as “Absolute Existence.” It must be that which is (1) an object of knowledge, (2) that which is an object of ignorance, or (3) that which is neither an object of knowledge nor an object of ignorance. That which we can neither know nor be ignorant of is the contradictory and as such cannot be that which absolutely exists; Ferrier argues that this is a conclusion that even skeptics must allow for. He says:

No form of scepticism has ever questioned the fact that something absolutely exists, or has ever maintained that this something was the nonsensical. The sceptic, even when he carries his opinions to an extreme, merely doubts or denies our competency to find out and declare what absolutely exists. (Ferrier 2001: vol. 1. 466)

Therefore, that which exists must be the object of knowledge or ignorance, or, in other words, it is the absolute: a synthesis of subject-with-object.

The influence of Berkeley again becomes apparent in the development of his idealist ontology because he concludes the Institutes with the proposition that there is only one necessary absolute existence, namely, a supreme mind in synthesis with the universe. He says: “All absolute existences are contingent except one; in other words, there is One, but only one, Absolute Existence which is strictly necessary; and that existence is a supreme and infinite, and everlasting Mind in synthesis with all things” (Ferrier 2001: vol. 1. 522).  Grounding Ferrier’s metaphysics is the notion that God is both the supreme knower and the only necessary knower. Every other knower is finite and contingent; therefore, the existence of reality cannot depend on them. Ferrier argues that reason dictates that there must be a supreme mind to prevent the universe from being contradictory. This is because objects per se are contradictory. Therefore, the universe, which constitutes the objective part of knowledge, must be in conjunction with some subject in order to provide it with existence.

3. Reception and Influence

Ferrier was arguably the best Scottish philosopher of his generation. However, his contemporaries did not uniformly welcome his idealist metaphysics, believing the Institutes to be too far removed from the philosophy of his predecessors. Commonsense philosophy was dominant in the Scottish universities in the decades following Reid’s death. Subsequent generations of philosophers from Dugald Stewart to Hamilton defended some version of commonsense philosophy, which led nineteenth-century writers such as Ferrier, Andrew Seth Pringle-Pattison, and James McCosh to speak of a tradition of “Scottish philosophy.” In the history of Scottish philosophy, the role of the universities was of considerable importance, and acquiring a key university Chair often signified the status of the philosopher at the time. Many important philosophers held such academic chairs; for instance, both Adam Smith and Thomas Reid held the Chair of Moral Philosophy at Glasgow, Dugald Stewart was the Chair of Moral Philosophy at Edinburgh, and Sir William Hamilton was the Chair of Logic and Metaphysics at Edinburgh. A notable exception to this list is David Hume who unsuccessfully tried to acquire Chairs of philosophy at both Edinburgh and Glasgow. In many respects, Ferrier was the obvious candidate to succeed Hamilton in the esteemed Chair of Logic and Metaphysics at Edinburgh. Although Hamilton was best known for his editions of Reid’s works, he tried to combine Reid with Kant, while placing a greater emphasis on metaphysics than there had been before. Ferrier developed this tendency towards metaphysics even further with his idealism and his rejection of Reid’s commonsense philosophy. Additionally, Ferrier had taught in place of Hamilton during his mentor’s illness during the forties, and he was highly esteemed by Hamilton and others for his philosophical acuity. 
Nevertheless, Ferrier was unsuccessful in his attempt to acquire the Chair of Logic and Metaphysics in 1856, losing out to the lesser-known Alexander Campbell Fraser.

He reacted angrily to his defeat and it led him to produce his polemical work Scottish Philosophy: The Old and the New, which is a defense of his philosophical system as well as a scathing attack on his opponents. Ferrier’s animosity is not directed at Fraser; instead, he targets those who campaigned against him as well as Edinburgh’s Town Council who were responsible for appointing Hamilton’s successor. Here, he employs extraordinary rhetoric to argue that there is a distinction between old and new Scottish philosophy. In his analysis, his idealist metaphysics represents a “new Scottish philosophy,” whereas adherence to Reid and Hamilton is equivalent to perpetuating the “old Scottish philosophy.” In the campaign against Ferrier, his idealism was portrayed as being insufficiently Scottish. He replies that his philosophy is quintessentially Scottish even though it differs from Reid and Hamilton in certain respects. He says: “Philosophy is not traditional. As a mere inheritance it carries no benefit to either man or boy. The more it is a received dogmatic, the less it is a quickening process” (Ferrier 1856: 9). To discredit Ferrier his philosophy was compared to both Hegel and Spinoza with associations of pantheism and atheism mixed with nationalism and xenophobia. Ferrier denies the accusation that his philosophy is Hegelian and points out that claims to the contrary are simply propaganda. Moreover, he responds to suggestions that his philosophy is similar to Spinoza’s by wholeheartedly demonstrating his antipathy toward those who campaigned against him: “all the outcry which has been raised against Spinoza has its origin in nothing but ignorance, hypocrisy, and cant” (Ferrier 1856: 14). Ferrier was educated in the Scottish tradition, and the work he created was in direct reaction to it. The difference between Ferrier’s Institutes of Metaphysic and Reid’s philosophy of common sense is substantial. 
However, the difference between Ferrier’s thought and Hamilton’s is less dramatic.

Ironically, some decades later, the association with Hegel did not carry a negative connotation. Alexander Campbell Fraser went on to teach several of the British Idealists of the latter part of the nineteenth century, and Edward Caird, an avowed Hegelian, was the Professor of Moral Philosophy in Glasgow for several years. The idealist R. B. Haldane summed up this change in attitude when he said: “The Time-Spirit is fond of revenges” (Haldane 1899: 9). In retrospect, Ferrier’s idealism appeared a few decades too early to find a receptive audience.

4. References and Further Reading

a. Primary Sources

  • Ferrier, James Frederick, Philosophical Works of James Frederick Ferrier, 3 vols: i. Institutes of Metaphysic, ii. Lectures on Greek Philosophy, iii. Philosophical Remains, Bristol: Thoemmes Press, 2001.
  • Ferrier, James Frederick, Scottish Philosophy: The Old and the New, Edinburgh: Sutherland and Knox, 1856.

b. Secondary Sources

  • Boucher, David, “Introduction” in The Scottish Idealists: Selected Philosophical Writings, Exeter: Imprint Academic, 2004.
  • Broadie, Alexander, A History of Scottish Philosophy, Edinburgh: Edinburgh University Press, 2009.
  • Cairns, Revd. J, An Examination of Professor Ferrier’s “Theory of Knowing and Being,” Edinburgh: Thomas Constable and Co, 1856.
  • Davie, George, Ferrier and the Blackout of the Scottish Enlightenment. Edinburgh: Edinburgh Review, 2003.
  • Davie, George, The Democratic Intellect: Scotland and Her Universities in the Nineteenth Century. Edinburgh: Edinburgh University Press, 1961.
  • Davie, George, The Scotch Metaphysics: A Century of Enlightenment in Scotland. London: Routledge, 2001.
  • Ferreira, Phillip, “James Frederick Ferrier” in A. C. Grayling, Naomi Goulder, and Andrew Pyle (eds.), Continuum Encyclopedia of British Philosophy, London: Thoemmes Continuum, 2006, ii. 1085-1087.
  • Fraser, Alexander Campbell, “Ferrier’s Theory of Knowing and Being” in Essays in Philosophy. Edinburgh: W.P. Kennedy, 1856.
  • Graham, Gordon (ed.), Scottish Philosophy in the Nineteenth and Twentieth Centuries, Oxford: Oxford University Press, 2015.
  • Graham, Gordon, “The Nineteenth-Century Aftermath” in Broadie, Alexander (ed.), The Cambridge Companion to the Scottish Enlightenment, Cambridge: Cambridge University Press, 2003.
  • Haldane, E. S., James Frederick Ferrier. Edinburgh and London: Oliphant Anderson & Ferrier, 1899.
  • Haldane, John, “Introduction” in Ferrier, James Frederick, Philosophical Works of James Frederick Ferrier, Bristol: Thoemmes Press, i. Institutes of Metaphysic, 2001.
  • Jaffro, Laurent, “Reid said the business, but Berkeley did it.” Ferrier interprète de l’immatérialisme in Revue philosophique de la France et de l’étranger 135: 1, pp.135-149, 2010.
  • Keefe, Jenny, “James Ferrier and the Theory of Ignorance” in The Monist, Volume 90, No.2, pp.297-309, 2007.
  • Keefe, Jenny, “The Return to Berkeley” in British Journal for the History of Philosophy, Volume 15, Issue 1, pp.101-113, 2007.
  • Lushington, E. L., “Introductory Notice” in Ferrier, James Frederick, Philosophical Works of James Frederick Ferrier, Bristol: Thoemmes Press, ii. Lectures on Greek Philosophy, 2001.
  • Mander, W. J., British Idealism: A History, Oxford: Oxford University Press, 2011.
  • Mander, W. J. and Panagakou, S., British Idealism and the Concept of the Self, London: Palgrave Macmillan, 2016.
  • Mander, W. J. (ed.), The Oxford Handbook of British Philosophy in the Nineteenth Century, Oxford: Oxford University Press, 2014.
  • Mayo, Bernard, “The Moral and the Physical Order: A Reappraisal of James Frederick Ferrier,” Inaugural Lecture, University of St Andrews, 1969.
  • McCosh, James, The Scottish Philosophy, New York: Robert Carter and Brothers, 1875.
  • McDermid, Douglas, “Ferrier and the Myth of Scottish Common Sense Realism” in Journal of Scottish Philosophy, Volume 11, Issue 1, pp.87-107, 2013.
  • McDermid, Douglas, The Rise and Fall of Scottish Common Sense Realism, Oxford: Oxford University Press, 2018.
  • Muirhead, J. H., The Platonic Tradition in Anglo-Saxon Philosophy, London: George Allen & Unwin, 1931.
  • Segerstedt, Torgny T., The Problem of Knowledge in Scottish Philosophy (Reid-Stewart-Hamilton-Ferrier). Lund: Gleerup, 1931.
  • Seth, Andrew, Scottish Philosophy: A Comparison of the Scottish and German Answers to Hume, Edinburgh and London: William Blackwood and Sons, 1885.
  • Sorley, W. R., A History of English Philosophy, Cambridge: Cambridge University Press, 1920.
  • Thomson, Arthur, Ferrier of St Andrews: An Academic Tragedy, Edinburgh: Scottish Academic Press, 1985.
  • The Testimonials of J.F. Ferrier, Candidate for the Chair of Moral Philosophy in the University of Edinburgh, Second Series, 1852.

 

Author Information

Jenny Keefe
Email: keefe@uwp.edu
University of Wisconsin–Parkside
U. S. A.

Eduard Hanslick (1825–1904)

Eduard Hanslick was a Prague-born Austrian aesthetic theorist, music critic, and the first professor of aesthetics and history of music at the University of Vienna, who is commonly considered the founder of musical formalism in aesthetics. His seminal treatise Vom Musikalisch-Schönen (On the Musically Beautiful) of 1854 is one of the most significant contributions to musical aesthetics ever written, as is evident from the ten editions the book went through during Hanslick’s lifetime, with many editions to follow. Hanslick’s classic treatise was translated into English as early as 1891. On the Musically Beautiful, or OMB, posits an aesthetic approach to music derived solely from its specific material features, an approach that has helped to shape the fields of aesthetics and musicology up to our own day. Hanslick’s scientific and objectivist orientation, his critical attitude towards metaphysics, and his theory of emotion, strikingly reminiscent of modern cognitive concepts, guarantee his continued relevance for current debates.

OMB is notorious primarily for its ostensible repudiation of any pertinent connection between music and affect states. Hanslick’s concept of music, according to this view, is based solely on the formal aspects of pure music that does not arouse, express, represent, or allude to human emotion in any way relevant to its artistic essence: The content of music, Hanslick (in)famously proclaimed, consists entirely of “sonically moved forms.”

This article provides an introduction to Hanslick’s biography, his early music reviews, which differ considerably from the eventual opinions he is commonly associated with, and portrays the key arguments of Hanslick’s aesthetic approach as presented in OMB, including a reconstruction of the complex genesis of this book. The concluding paragraphs encompass an overview of several crucial sources of Hanslick’s viewpoint, seemingly oscillating between German idealism and Austrian positivism, as well as a concise history of Hanslick’s reception in analytical philosophy of music, which continues to struggle with the issues posed by Hanslick’s cognitive concept of emotion and has drafted numerous strategies to circumvent Hanslick’s skeptical outcome.

Table of Contents

  1. Biography
  2. Early Works and Critical Writings
  3. Vom Musikalisch-Schönen / On the Musically Beautiful
    1. Genesis and Conceptual Organization of OMB
    2. Purpose, Methods, and General Outlook of OMB
    3. Arousal, Expression, and the Cognitive Concept of Emotion
    4. The Musically Beautiful and Music’s Relation to History
    5. Listening, Music’s Relation to Nature, and Music’s Content
    6. Conclusion: The Curious Nature of Hanslick’s Formalism
  4. The Intellectual Background of Hanslick’s Aesthetics
    1. Hanslick and German Idealism
    2. Hanslick and Austrian Realism
    3. Editorial Problems and Eclectic Origins of OMB
  5. The Reception of Hanslick’s Aesthetics and Its Relevance to Current Discourse
    1. A General Outline of Hanslick’s Reception by Austro-German Discourse
    2. Hanslick’s Reception by Analytical Aesthetics and the Direct Impact of OMB
    3. Bypassing Hanslick’s Cognitivist Arguments: Kivy, Davies, and Moods
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Eduard Hanslick, who Germanized his surname by inserting a “c” upon his move to Vienna in 1846, was born in Prague on September 11, 1825 as the son of Josef Adolf (1785–1859) and Karoline Hanslik (1796–1843), daughter of the Jewish court factor Salomon Abraham Kisch (1768–1840). According to Hanslick’s memoirs, his father was responsible for his education and thus may have sparked his interest in aesthetics, as Josef Adolf edited the two volumes of Johann Heinrich Dambeck’s Vorlesungen über Ästhetik (Lectures on Aesthetics, 1822–23) and filled in as Dambeck’s substitute in 1816–17, teaching aesthetics at Prague’s Charles University. Hanslick, who also took lessons with the renowned composer Václav Tomášek (1774–1850), completed his philosophical elementary studies (a three-year course in general education mandatory for all prospective university attendees) between 1840 and 1843, enrolled in law at Prague, and attained his doctoral degree in Vienna in 1849 (on Hanslick’s early days, see Grey 2002, 828–29; Grey 2011, 360–61; Hanslick 2018, xv–xvi). Hanslick’s background in law had significant influence on his philosophical methodology, as his standard for evidence and his emphasis on “proximate causes” (Hanslick 1986, 32), which limit the chain of “admissible causes-in-fact” and enable Hanslick’s strong focus on “the music itself” instead of the listener, performer, or composer (Pryer 2013, 55), are clearly derived from juridical training. After a short-lived employment as a fiscal civil servant in Klagenfurt (Carinthia) in 1850–52, during which Hanslick prepared for an academic profession (Wilfing 2018, 91n), he returned to Vienna to work at the ministry of finances and was subsequently transferred to the ministry of education in 1854.

This move proved crucial for Hanslick’s future career, as Count Thun-Hohenstein (1811–88), who led the education department from 1849 to 1860, had been charged with the overall reform of Austrian education following the 1848–49 revolution, and Hanslick thus came into direct contact with Thun’s agenda and the demands of the science policies of the Hapsburg Monarchy. The initial traces of the book he would become famous for also fall within this time frame, with OMB completed in 1854. In 1856, this book was acknowledged retroactively as a philosophical habilitation, thereby granting Hanslick an unsalaried professorship at the University of Vienna that turned into a salaried position in 1861, and ultimately a full post in 1870. Hanslick retained this post until he retired in 1895, and his successor Guido Adler (1855–1941) was appointed as professor of theory and history of music, a designation diverging markedly from Hanslick’s emphasis on aesthetics. Hanslick was firmly established in the cultural and musical scene of Vienna: he consulted in awarding public music grants and judged musical contests, was an official Austrian delegate at international conferences and world fairs, and he became the first chair of Denkmäler der Tonkunst in Österreich (Monuments of Musical Art in Austria) from 1893 to 1897, a society that has continued to edit musical pieces of historic bearing on Austria until today. In addition to his academic activities, Hanslick experienced a widely successful career as a music critic (see the next section), which lasted until 1895, when Hanslick retired from his music editor post at Neue Freie Presse. Despite his retirement, Hanslick continued to publish criticism in this very journal until his death in 1904, with the last text to appear on April 7, two months before his passing, an event noted as far afield as the Musical Times and the New York Times (McColl 1995).

2. Early Works and Critical Writings

Except for his aesthetic treatise, Hanslick is renowned primarily for his activities as a music critic. As philosophical commentators usually concern themselves exclusively with OMB, the present section will briefly sketch Hanslick’s relevance in 19th-century musical discourse and will also indicate the diversity of his critical position. Today, Hanslick is known best for his skeptical attitude towards the New German School, a vague label for a loose group that is thought to comprise composers such as Hector Berlioz (1803–69), Franz Liszt (1811–86), and Richard Wagner (1813–83), but also refers to influential journalists such as Franz Brendel (1811–68), editor-in-chief of Neue Zeitschrift für Musik. Hanslick’s career as a music critic started early on as an occasional contributor to Beiblätter zu Ost und West (Prague 1844) and, upon his move to Vienna in 1846, the Wiener Allgemeine Musik-Zeitung, ultimately transferring to the imperial Wiener Zeitung in 1848, prior to his music editor posts at Die Presse (1855–64) and its liberal offshoot Neue Freie Presse (1864–95). At that time, Hanslick proved to be an advocate of composers he would eventually disapprove of, such as Berlioz, who was called the “most magnificent phenomenon in… musical poetry,” and Wagner, who was proclaimed the “greatest dramatic talent among living composers” (Hanslick 1993, 40, 59; for the latter review, see Hanslick 1950, 33–45). Hanslick, who was acquainted personally with important composers of his era (he met Wagner as early as 1845 and acted as a local guide for Berlioz in 1846; Payzant 1991 and 2002, 63–71), at that time professed a romantic outlook (Yoshida 2001, 181–84) and deemed “pure” music a “language of the emotions” and the “revelation of the innermost world of ideas” (Hanslick 1993, 98, 115). 
For readers of an aesthetic theorist commonly associated with the “repudiation” of emotive musical meaning (Budd 1980) and the proponent of a classicist conception of music that does not refer to anything beyond itself, Hanslick’s 1848 essay on “Censorship and Art-Criticism” must seem particularly surprising. In this text, he condemns the “inadequate perspective that saw in music merely a symmetrical succession of pleasing tones.” Truly artful music, he continues, represents “more than music”; it is a “reflection of the philosophical, religious, and political world-views” of its time (Hanslick 1993, 157).

In the early 1850s, however, Hanslick’s outlook on music shifted considerably and eventually developed into a more “formalist” viewpoint that inverted his previously positive appraisal of Wagner’s operas. Although an exact date or a conclusive inducement for his “volte-face” (Payzant 1991, 107) is hard to determine definitively, the classicist writings of the Prague music critic Bernhard Gutt (1812–49), from whom he adopted multiple quotations (Payzant 1989), the failed political upheaval of 1848–49, and the resulting execution of his cherished colleague Alfred Julius Becher (1803–48) seem to be crucial reasons for Hanslick’s change of opinion (Bonds 2014, 153–54; Landerer and Wilfing 2018, sec. 2). Whereas in 1848 Hanslick regarded “pure” music as an exhaustive repository for intellectual reflection that exerts tangible impact on the world of politics and religion, from this time on he developed a more formalistic conception of musical artworks that emphasizes their essentially autonomous nature. In making this move, Hanslick took part in the general erosion of Hegelian criticism, the political direction of which lost most of its appeal in the aftermath of 1848 (Pederson 1996), and entirely detached music and its aesthetic qualities from its involvement with worldly politics. Whereas the political activities of other critics ceased while they retained crucial elements of Hegelian aesthetics, such as emotivism or its focus on concrete content, Hanslick’s reversal was virtually complete. This turn is observable particularly in respect to the debate about external musical meaning that Hanslick declared the pivotal feature of art in 1848. A few years later, prior to the initial edition of OMB in 1854, he had reversed his attitude entirely by stating that “if an orchestral composition requires external means of conceptual understanding [that is, a literary program] in order to please… then its musical value already appears to be in question” (Hanslick 1994, 293). 
Hanslick’s notion of music’s nature thus shifted from a romantic position emphasizing conceptual meaning to an appraisal of internal musical meaning oriented towards formal issues such as the inherent potential of the main theme or the clarity of melodic figures (Payzant 2002, 88–91, 96–98, 117–19).

Although Hanslick therefore adopted a critical attitude towards the New German School in later years and took issue with its poetization of “pure” music (Larkin 2013), certain matters have to be kept in mind that challenge the widespread assumption of Hanslick being a “stodgy, pedantic spokesperson for ‘conservative’ musical causes” (Gooley 2011, 289). Hanslick’s criticism of Wagner and his followers generally concerned the musical aspects of their works and deplored an absence of motivic-thematic manipulation or an overly rigorous devotion to a literary program that supposedly interfered with the “organic” unfolding of melody. His general valuation of these works, however, often proves to be astoundingly differentiated (on Hanslick’s appraisal of Wagner, see Grey 1995, 1–50; Pederson 2013, 176–77; Bonds 2014, 237–46). Although Hanslick assessed Der Ring des Nibelungen in 1876 to be “a distortion, a perversion of basic musical laws,” he was at the same time able to realize that Wagner’s tetralogy represents “a remarkable development in cultural history” (Hanslick 1950, 139, 129). It is beyond serious debate that Hanslick preferred Beethoven (1770–1827), Brahms (1833–97), and Mozart (1756–91) to Mahler (1860–1911), Strauss (1864–1949), or the Wagner “school.” Hanslick, however, did not panegyrize his preferred musicians as he did not condemn his “opponents” without reservation. Although Hanslick bemoaned Wagner’s musical system, his continuous modulations, and the dubious semantic qualities of the Leitmotiv—which he called “musical uniforms”—he nonetheless appreciated his “genius for theatrical effect” (Hanslick 1950, 121, 151) and stressed the musical virtues of specific sections of Wagner’s operas. As he clarified in 1889: “Only a fool or dedicated factionist” would answer the question of Wagner’s qualities “with two words: ‘I idolize him!’ or ‘I abhor him!’” (Hanslick 1889, 56). 
Furthermore, Hanslick critically (and sometimes financially) supported more modernistic composers such as Bedřich Smetana (1824–84) or Antonín Dvořák (1841–1904) as long as their general artistic principles conformed to his aesthetic approach to a certain degree (Brodbeck 2007 and 2009; Larkin 2013).

3. Vom Musikalisch-Schönen / On the Musically Beautiful

a. Genesis and Conceptual Organization of OMB

From July 1853 to March 1854, Hanslick pre-published several chapters of OMB as stand-alone articles that deal with the subjective impression and (physiological) perception of music, as well as with the complex relations between music and nature. His three-piece essay “On the Subjective Impression of Music and its Position in Aesthetics” (Hanslick 1853) was eventually transformed into chapters 4 and 5 of the finalized manuscript, whereas “Music in its Relations to Nature” (Hanslick 1854)—itself based on a public lecture of 1851—turned into chapter 6, with both texts running through hardly any significant alterations. Scholarship on the actual genesis of OMB is rather sparse, as Hanslick’s private records were lost during the Second World War (Wilfing 2018, sec. 1), and has not yet reached a consensus regarding the chronological development of Hanslick’s momentous monograph. Whereas Geoffrey Payzant surmised that Hanslick’s articles were taken from the final version of OMB (Payzant 1985, 180), recent research points to the logical order of Hanslick’s argument that runs counter to the familiar sequence of published chapters in OMB and assumes that these three chapters (4–6) were indeed written prior to the more famous chapters 1 to 3, therefore presenting the nucleus of OMB (Landerer and Wilfing 2018, sec. 4; Hanslick 2018, xvii–xix). According to this view, Hanslick first lays the foundation for his aesthetic approach by clarifying an idea of tone (chapter 6) and the way in which tones are received from the standpoint of physiology and psychology (chapters 4 and 5). This analysis is followed by Hanslick’s concept of emotion, how emotions are predicated upon these physiological and psychological responses, and what role emotions play in musical aesthetics (chapters 1–2). 
Finally, following Hanslick’s hypothesis that emotion does not form a substantial component of objectivist aesthetics, he presents his positive thesis (chapter 3) and closes his argument with concluding comments that summarize his key findings and widen the conceptual framework of OMB (chapter 7).

b. Purpose, Methods, and General Outlook of OMB

Hanslick did not write any other academic works apart from OMB and the Geschichte des Concertwesens in Wien (History of Concert in Vienna, 1869) and focused his literary output almost entirely on reviews. Why did he decide to publish an aesthetic treatise at the age of 29? The reasons given by Hanslick himself are to provide a critique of the aesthetic emotivism that dominated mid-century discourse and to challenge the “advocates of the music of the future,” who supposedly endangered the “independent significance of music” (Hanslick 2018, lxxxv). By directly accusing Liszt and Wagner of belittling the inherent qualities of “pure” music, Hanslick contributed significantly to the view that OMB has to be read as a book directed against Wagner, a view that was conducive to the longevity of Hanslick’s treatise through the discussions surrounding the New German School. Even though there is some truth to this claim, scholars dispute that Wagner’s music can actually be regarded as the prime spark for the production of OMB (Grey 2003, 169; Brodbeck 2014, 50), not least because Wagner’s later works that Hanslick specifically disapproved of had not yet been written and Wagner’s name rarely appears in the initial edition of Hanslick’s treatise (several quotes from Wagner’s theoretical writings are belatedly included in the sixth edition of 1881). Wagner’s music, even though it was a useful target in order to remain relevant, thus does not seem to be the crucial reason for writing OMB, as the conceptual framework of Hanslick’s argument would have been very much the same “had the figure of Wagner not been there” (Bujić 1988, 8). A more tangible motive seems to be Hanslick’s very early aspiration towards an academic profession in order to leave behind his rather tedious employment as a public servant.
We know from letters written around 1851 that Hanslick noticed the absence of musical aesthetics and musicology from the Viennese university curriculum and saw the opportunity to carve a niche for his unique talent. In light of Hanslick’s academic ambitions, it comes as no surprise that OMB does not start with a theoretical definition of art, music, or beauty. On the contrary, Hanslick’s examination commences with an exhaustive definition of musical aesthetics as a scientific discipline.

Whereas romantic aesthetic theorists had occupied themselves with music’s relation to affect states, feelings, and emotions, scientific aesthetics should focus on the object itself instead of its (historical) production or (arbitrary) reception. If musical aesthetics is to become scientific, Hanslick proclaims in a sentence that strikingly anticipates Edmund Husserl’s (1859–1938) phenomenology (Wilfing 2016, 24–25), it has to “approach the natural scientific method at least as far as trying to penetrate to the things themselves” (Hanslick 2018, 1). Furthermore, the specified aesthetics of music should detach itself from any theoretical dependency on a general concept of artistic beauty that is employed to categorize “pure” music ex post facto. German idealism typically contrived an aesthetic approach firmly rooted in an overarching philosophical framework. Art, regardless of the specific medium, thus must satisfy certain epistemic principles and ethical criteria derived from this general system in order to be classified as beautiful. Idealist aesthetics therefore typically identified universal conditions of artistic beauty that were binding equally for a poem, a tragedy, a painting, a sculpture, or a piece of music (Wilfing 2018, sec. 3.3). For Hanslick, this system-bound approach was completely misguided as he is concerned exclusively with musical beauty, the “musically-beautiful,” so that it is even hard to see how his notion of specific musical beauty is related to any general concept of beauty (Bonds 2014, 190). For him, the “laws of beauty of each art are inseparable from the characteristics of its material, of its technique” (Hanslick 2018, 2). 
For this reason alone, Payzant’s rendition of Vom Musikalisch-Schönen as On the Musically Beautiful captures Hanslick’s ideas much better than Cohen’s The Beautiful in Music, which suggests an aesthetic approach contrary to Hanslick’s intentions: he did not propose an abstract principle of artistic beauty, applied retroactively to “pure” music, but was interested principally in the beauty that is manifest solely and explicitly in the art of tones (Hamilton 2007, 81; Bonds 2014, 190).

c. Arousal, Expression, and the Cognitive Concept of Emotion

To this end, Hanslick develops two central theses: a positive one, explored in chapter 3, that attempts to show that musical beauty is dependent completely on the inherent qualities of music itself, and a negative one, defined in chapters 1–2, that challenges the familiar concept that music is supposed to represent feelings and that its emotive content forms the basis of aesthetic judgment. Both ideas share common ground in Hanslick’s objective approach: as the musical artwork and its material features represent the core of Hanslick’s aesthetics, the “subjective impression” of music, its emotive impact, is relegated to a secondary aftereffect of musical material. We must thus “stick to the rule that in aesthetic investigations primarily the beautiful object, and not the perceiving subject, is to be researched” (Hanslick 2018, 2–3). Hanslick specifically addresses two ways in which music is thought to be related to affect states: (1) The idea that music’s purpose is to arouse emotion and (2) that emotions represent the content of musical artworks (an assumption employed frequently to compensate for the lack of notional meaning in music alone). The first stance is countered by the classical argument of beauty having no purpose and “content of its own other than itself.” Beauty may very well arouse pleasant feelings in the perceiving individual, but to do so is not at all constitutive for the musically beautiful that exists apart from the listener’s cognition and remains beautiful “even if it is neither viewed nor contemplated. The beautiful is thus namely merely for the pleasure of the viewing subject, but not by means of the subject” (Hanslick 2018, 4). In an argument that anticipates Edmund Gurney’s (1847–88) renowned distinction between impressive music and expressive music (Gurney 1880, 314), Hanslick moreover maintains that music’s beauty and its emotive impact do not correlate inevitably. 
Thus, a beautiful composition may not arouse any specific feelings, whilst the strong emotive impact of another musical piece does not necessarily substantiate its aesthetic qualities (Hanslick 2018, 31–33; Robert Yanal 2006 dubs this idea the “third thesis” of OMB). In general, emotive arousal—for the most part depending on individual experience, musical edification, historical discourse, and so on—cannot provide a reasonable foundation for scientific aesthetics as it exhibits “neither the necessity nor the exclusivity nor the consistency” required to establish an aesthetic principle (Hanslick 2018, 9).

In chapter 2 of OMB, Hanslick presents his key argument against emotion forming the content of “pure” music by introducing his cognitive concept of emotion—a concept that brought his treatise to the forefront of analytical aesthetics. There was widespread consensus amongst idealist systems of art that art must have some sort of content. As “pure” music lacks tangible meaning, romantic theorists invoked the opposite of conceptual definiteness as the obvious candidate for music’s content: emotion (love, fear, anger, and the like). This claim, Hanslick maintains, represents the weak spot of musical emotivism. Emotion by no means forms the conceptless counterpart to literary meaning. On the contrary, emotions are “dependent on physiological and pathological conditions” and are invoked by “mental images, judgments, in short by the entire range of intelligible and rational thought” (Hanslick 2018, 15). The analytical philosopher Peter Kivy (1990, chap. 8) popularized this view with a practical example: If I assume that uncle Charlie is cheating during a card game, the anger I experience is contingent on the object of my emotion, Charlie. However, in order to be angry, a complex structure of cognitive parameters has to be in place. I must consider cheating an immoral or indecent behavior—a belief built upon some sort of ethical system—that is performed purposely by Charlie. As soon as I spot that Charlie is not deceitful wittingly and has played the wrong cards by accident, my anger is likely to evaporate, as its conceptual foundation disappears. Emotion, in short, needs an intentional object to be an emotion—an object that “pure” music is unable to provide. As music lacks the “cognitive mechanism” necessary to portray the objects of concrete emotions, the depiction of a specific feeling “does not at all lie within music’s own capabilities” (Hanslick 2018, 15–16). 
However, music alone can express the dynamic features of emotions via its own musical impetus and is thus able to portray “one aspect of feeling, not feeling itself” (Hanslick 2018, 18). Thus, even though music alone cannot express love, fear, or anger in a direct manner, its dynamic structure can reproduce the associated movement of concrete emotions or actual events (Hanslick 2018, 30), but not in ways that allow for definite meaning, as the dynamic character of both love and anger could be violent, desperate, or passionate in specific instances.

Hanslick’s exact stance on the relation of emotion and “pure” music represents a major point of contention in current research. Several scholars hold that Hanslick severed any relevant bonds between music and affect states, so that music itself “has nothing to do with emotion” (Zangwill 2004, 29) and emotions in turn have “nothing to do with musical beauty” (Lippman 1992, 299). Other scholars point to the preface of Hanslick’s treatise, in which he states that for him the value of beauty is based on “the direct evidence of feeling” and that his protest only pertains to the “mistaken intrusion of feelings in the domain of science” (Hanslick 2018, lxxxiv). In chapter 1, Hanslick makes the same move when it comes to musical arousal: he does not want to “underestimate” the “strong feelings that music awakens from their slumber,” but merely refutes the “unscientific assessment of these facts for aesthetic principles” (Hanslick 2018, 9). For Payzant, Hanslick accepts music’s capacity to arouse, express, or portray emotion; he only “says that to do so is not the defining purpose of music” (Hanslick 1986, xvi). Stephen Davies and Peter Kivy, who in 1980 concurrently established a concept of musical emotion based chiefly on the dynamic features of musical structure that readily suggest the outward features of expressive behavior (Trivedi 2011), regarded Hanslick as a historical precursor to their shared model of enhanced formalism. The crucial disparity between enhanced formalism and Hanslick’s aesthetics, both authors hold, is that they conceive of expressive properties as objective musical properties, whereas Hanslick was reluctant to take this step (Davies 1994, 204; Kivy 2009, 64). Based on numerous passages of OMB that suggest music’s ability to be “itself intellectually stimulating and soulful” and that show how music alone “absorbs” its creator’s feelings (Hanslick 2018, 45–46, 65), this view has been called into question. 
As Hanslick locates emotive meaning in music’s kinetic features that replicate the dynamic properties of affective conditions, his stance might come close to enhanced formalism (Cook 2001, 175). In view of Hanslick’s account of musical emotion as “silhouettes” (Hanslick 2018, 27) that open a certain variety of possible meaning whilst precluding capricious readings of music, he seems to regard musical elements as indefinitely expressive (Srećković 2014, 131)—an approach that anticipates Susanne K. Langer’s (1895–1985) theory of music as an “unconsummated symbol” (Wilfing 2016, 26–29).

d. The Musically Beautiful and Music’s Relation to History

Hanslick’s arguments regarding the complex relations between emotions and music, the indeterminate expressivity of musical gestures, as well as their debatable relevance for scientific aesthetics, however, merely apply to “pure” music. As vocal music forms an amalgam of music and poetry, the emotions aroused by it cannot be ascribed to any of its codependent components in arbitrary isolation. Thus, “pure” music—instrumental compositions without a literary program, title, or text—forms the basis of Hanslick’s aesthetics (Hanslick 2018, 23–26). This lopsided approach has led scholars to assume that Hanslick regarded vocal music as an impure blending of “absolute” art forms, whilst considering instrumental music to be the ideal form of music (Alperson 2004, 260; Gracyk, chap. 1). By contrast, other scholars stressed Hanslick’s statement that any leaning towards a specific subclass of music proves to be an “unscientific procedure” (Hanslick 2018, 24), and thus read Hanslick’s favoritism as a methodological consideration without normative implications (Bonds 2014, 12; Grey 2014, 44). For Hanslick, musical beauty is never based on the literary meaning or the emotive features of music but is rather found “solely in the tones and their artistic connection”: “The content of music,” as he famously proclaims, “is sonically moved forms” (Hanslick 2018, 40–41). The purport of Hanslick’s notorious sentence has evoked a wide array of possible readings. Although the “forms” he speaks about have been interpreted occasionally to refer to large-scale forms (concerto, sonata, rondo, and so on) and have thus been translated in the singular (Dahlhaus 1989, 130; Karnes 2008, 30), it seems likely that this term actually denotes musical elements and their structural conjunction (Wilfing 2018, sec. 3.3). In contrast, sonically or “tonally” (tönend), as Payzant renders this term (Hanslick 1986, 29), is an unclear concept that has been explained divisively. 
Whereas Payzant takes this term to refer to “tone” as part of the diatonic musical scale (2002, 44–46), Landerer and Rothfarb translate tönend as “sonically” and therefore emphasize its auditory features. Much of the question whether Hanslick perceived “pure” music to be captured entirely in the score itself (Subotnik 1991, 279; Alperson 2004, 266) or to require an auditory experience to be appreciated aesthetically (Bujić 1988, 10; Hamilton 2007, 82) hinges on the problematic translation of tönend.

Hanslick, however, willingly concedes that an assertive definition of the musically beautiful is virtually impossible to achieve because “pure” music cannot express concrete meaning. Any account of music’s content thereby amounts to “dry technical specifications” or “poetic fictions” (Hanslick 2018, 43). Music, in each case, must be understood musically and can be grasped only from within, as no verbal report can suffice. If we want to specify the content of a given theme for another person, “we have to play the theme itself for him” (Hanslick 2018, 113). Although Hanslick is unable to provide an exhaustive definition of musical beauty, he guards against potential fallacies: For him, the musically beautiful represents more than symmetry, regularity, proportion (Hanslick 2018, 57–59), or a pleasant sequence of tones, as these images neglect the crucial aspect of beauty: Geist (mind or intellect). The forms music consists of are “not empty but rather filled, not mere borders in a vacuum but rather intellect shaping itself from within” (Hanslick 2018, 43). Consequently, the act of composition is an “operation of the intellect in material of intellectual capacity” and the musically beautiful is produced primarily by the “intellectual power and individuality” of the composer’s imagination that has been absorbed by musical structure as a tonal idea that “pleases us in itself” (Hanslick 2018, 45–46). “Pure” music, Hanslick contends, has its own logic based on purely musical factors, the effect of which is governed by certain natural laws that have to be discovered, examined, and elucidated by aesthetic analysis (Hanslick 2018, 47–50). At this point, the tentative character of Hanslick’s approach becomes apparent, as he does not give any substantial indication as to how this goal could be realized beyond the idea that we must observe the efficacy of musical elements that are then reduced to general aesthetic categories that in turn lead to an ultimate principle. 
Although Hanslick cannot provide a conclusive treatment for scientific aesthetics, the pivotal insight of OMB seems clear: musical beauty depends on musical material and not on any concept or emotion. Thus, Hanslick wonders whether the divergent aesthetic qualities of musical artworks might hinge on the gradation or accuracy of emotional expression and answers in the negative: A piece shows more aesthetic qualities than another simply because it contains “more beautiful tone forms” (Hanslick 2018, 51).

Here, Hanslick mentions one of the few concrete examples of musical beauty by declaring creativity, originality, and spontaneity to be essential features of musical prowess. This view is notable because Hanslick’s notion of how musical beauty relates to history is one of the most divisive aspects of OMB. Hanslick’s emphasis on the intrinsic qualities of “pure” music, ruling out the various settings of creation, listening, or performance for aesthetic concerns, has led scholars to assume that Hanslick treats beauty ahistorically (Burford 2006, 172–73; Karnes 2008, 50–52; Bonds 2014, 176–77). This view is often based on Hanslick’s assurance that his concept of beauty applies to classicism as well as romanticism and thereby pertains to “every style in the same way, even in the most opposed ones” (Hanslick 2018, 55). Hanslick moreover advocates a categorical separation between historical reasoning and aesthetic judgment: whereas the historian’s exploration of the broader context of a given piece is undeniably warranted, aesthetic inquiry hears “only what the artwork itself articulates.” In regard to this hierarchy between the aesthetic relevance of artwork and context, Hanslick somewhat anticipates the New Criticism of 20th-century literary studies principally associated with Monroe C. Beardsley and William K. Wimsatt (Appelqvist 2010–11, 77–78). However, this idea is undermined immediately by Hanslick’s remarks on the indisputable connection of artworks to “the ideas and events of the time that produced them.” As music is created by an intellect, it stands in inextricable interrelation with concurrent productions of art and the “poetic, social, scientific conditions” of its time and place (Hanslick 2018, 55–56). 
For Hanslick, the aesthetic qualities of musical elements (particular cadences, intervallic progressions, modulations, and so on) are subject to historic decline and “wear out in fifty, even thirty years.” Eternal musical beauty is “little more than a nice turn of phrase” and we may say of compositions that “rank high above the norm of their time that they were once beautiful” (Hanslick 2018, 51, 58n). This theoretical contradiction prompted scholars to discern between Hanslick’s principle of scientific aesthetics, which is established ahistorically, and his concept of music itself and particular instances of the musically beautiful, which are subject to change (Landerer and Zangwill 2016, 490–92; Wilfing 2016, 17–18).

e. Listening, Music’s Relation to Nature, and Music’s Content

Although Hanslick openly rejects the listener’s relevance for the constitution of the musically beautiful that exists apart from the listener’s perception, the subjective impression of music forms the topic of chapters 4 and 5 of OMB. Hanslick is not at all interested in establishing a purely intellectual apprehension of musical structure. Beauty is rooted in (physical) sensation and engages the faculty of imagination as an intermediary between sensation, intellect, and feeling: listening to music in a purely rational fashion, Hanslick contends, is as far removed from aesthetic appraisal as mere affective arousal. The musical artwork acts as an “effective median between two animated forces,” the composer and the listener. The aesthetic exaltation of the composer’s imagination yields a theme shaped by the composer’s individuality, which is subsequently elaborated according to the artistic talents of its creator (Hanslick 2018, 63–64). The composer’s personality molds music’s “infinite capacity for expression” through his “consistent preference for certain keys, rhythms, [and] transitions” that transform the composer’s sensibility into a part of objective musical structure, which in turn is open to the listener’s perception (Hanslick 2018, 65). The listener’s judgment about the concrete meaning of a given piece is therefore affected heavily by performance, which allows the artist to release directly the emotion apparently perceived in music (Hanslick 2018, 67–69). For Hanslick, the genuine affective reaction of the listener, especially powerful in the case of music, is beyond dispute, but the ways in which it is constituted vary considerably. If the listener’s approach to “pure” music involves the attentive tracking of compositional development and therefore transcends emotional indulgence, the approach is aesthetical (Hanslick 2018, 88–90).
If the emotive impact of music is received passively, however, the listener’s attitude is regarded as “pathological”—a term that carries medical connotations but derives chiefly from the Greek notion of “pathos,” thereby denoting purely passive experience (Hanslick 2018, 81–88). For Hanslick, this mode of listening originates from the physical aspects of sound and its direct effect on the human nervous system and thus lacks the necessary component of Geist to be considered aesthetical. It actually belongs to physiological, psychological, or medical research and is not subject to aesthetic inquiry (Hanslick 2018, 71–80).

Hanslick’s analysis of the complex interplay between composer, artwork, and listener is followed by an investigation of music’s relation to nature, arguably the oldest chapter of OMB. In general, artworks present a twofold relation to nature: first, through their physical material (sound, paint, stone); second, through the content nature affords to art. In the case of “pure” music, considered a cultural artefact, the physical material provided by nature merely amounts to “material for material” (wood, hide, hair) that is used to create actual musical material (tones, intervals, scales), already a product of culture (Hanslick 2018, 95). Nature thus merely offers physical material for acoustic material that in turn provides material for the creative activity of the individual composer, which builds upon the collective repository of music history. As musical content consists entirely of musical features, the origins of which are not natural, Hanslick moreover postulates that nature cannot provide content for “pure” music and thus does not have any relation to musical artworks. Whereas sculptors, painters, and writers are able to draw inspiration from human actions or nature itself, music finds no preceding prototype beyond the history of “artificial” musical material and is thus only akin to architecture. In blatant contrast to mimetic concepts of art, Hanslick thus holds that “the composer cannot transform anything, he has to newly create everything” (Hanslick 2018, 103). At this point, Hanslick once more illustrates the historical evolution of musical material, emerging gradually as a creation of intellect, by noting how certain modern intervals “had to be achieved individually” over multiple centuries. 
Music itself, in each of its various aspects, is created entirely by intellectual ingenuity and represents a “consequence of the endlessly disseminated musical culture.” Hanslick therefore explicitly advises readers to “beware of the confusion as though this (present) tone system itself necessarily lies in nature” (Hanslick 2018, 95–97). As Hanslick’s concept of scientific aesthetics is based on material features of musical structure, this view has significant implications for his entire stance: since musical material will constantly undergo extension, any alteration pertaining to crucial aspects of musical technique will also affect the basics of aesthetic research (Hanslick 2018, 98–99).

Finally, Hanslick revisits the question of musical content in order to differentiate meticulously between distinct concepts of content usually lumped together indiscriminately. Content is defined as “what something contains, holds within itself.” In the case of music, “content” denotes the tones and forms a piece of music is made of. This term is not to be confused with “subject matter,” which typically indicates abstract literary content, of which music has none: “music speaks not merely through tones, it speaks only tones” (Hanslick 2018, 108–109). In music, the concepts of content and form (musical material and its artistic design) mutually determine each other and are ultimately inseparable: “With music, there is no content opposed to form, because it has no form outside of the content” (Hanslick 2018, 111–12). A separation between musical content and its form pertains only to cases in which form is applied to large-scale structures, which is not the standard meaning of this term in OMB. Only then can the theme be called content, whereas the overall structure, the “architectonic of the joined individual components and groups of which the piece of music consists,” acts as form. The theme, which “develops in an organically, clearly organized, gradual manner, like luxuriant blossoms from a single bud,” constitutes the irreducible aesthetic “essence” of a piece of music. As everything in a specific musical structure is a “spontaneous consequence” of the initial theme, the multitude of prospects in which a theme could be developed determines its aesthetic substance or Gehalt: “whatever does not reside in the theme (overtly or covertly) cannot subsequently be organically developed” (Hanslick 2018, 113–14).
Even though music thus does not present subject matter along the lines of literary meaning, “pure” music, animated by “thoughts and feelings,” clearly exhibits intellectual “substance.” Generally speaking, “pure” music has content: purely musical content manifest in the distinct musical features of the theme, which Hanslick describes poetically as a “spark of divine fire.” Musical content, Hanslick emphasizes in conclusion, derives purely from the “definite beautiful tone configuration” of a given piece as the “spontaneous creation of the intellect out of material of intellectual capacity” (Hanslick 2018, 114–16).

f. Conclusion: The Curious Nature of Hanslick’s Formalism

Hanslick’s aesthetics is frequently considered the “classical definition of formalistic aesthetics in music” (Yoshida 2001, 179) and the “inaugural text in the founding of musical formalism as a position in the philosophy of art” (Kivy 2009, 53). What is meant by musical formalism and which exact version of musical formalism Hanslick is supposed to represent, however, are among the divisive questions of Hanslick scholarship and of the philosophy of music at large. The conceptual significance of the term ‘form’ and its relevance for Hanslick’s theory seem to be overrated in principle. Philosophical commentators typically overlook that Hanslick’s definition of beauty in music, the focal point of OMB, does not rely upon any idea of form and that this term is indeed absent from Hanslick’s description of music’s artistic quality: by specific musical beauty, Hanslick designates a “beauty that is independent and not in need of an external content, something that resides solely in the tones and their artistic connection” (Hanslick 2018, 40). Furthermore, Hanslick’s infamous statement of “sonically moved forms” does not refer to music itself, as is regularly surmised, but much more narrowly to music’s content, which is thereby equated with form, and vice versa. Even the more pointed version in the second edition of OMB, which states that forms are “solely and exclusively the content and subject of music,” does not identify form with music itself but rather claims the identity of the content and forms of music (Hanslick 2018, 41). Thus, these forms are not without content or thought of as empty but rather are imbued by intellect (Geist) “shaping itself from within” (Hanslick 2018, 43), thereby linking beauty to mental activity (Bowman 1991, 47; Paddison 2002, 335; Burford 2006, 179).
Hanslick therefore opposes one of the central claims of formalist aesthetics that usually stresses the primacy of formal features over some kind of content (Fisher 1993, 250; Kivy 2002, 67; Beard and Gloag 2005, 65). In music, he states, “we see content and form, material and design, image and idea fused in an obscure, indivisible unity,” which means that “there is no content opposed to form” as music “has no form outside of the content” (Hanslick 2018, 111–12). By stating that form and content are one, Hanslick is “almost alone among formalists” (Payzant 2002, 83) and OMB thus even “reads more like a traditional criticism of formalism” (Hamilton 2007, 88).

Whether Hanslick’s aesthetics is to be regarded as formalist, however, depends entirely on the definition of formalism espoused by scholars. The special variety of Hanslick’s approach is clarified by one of the customary definitions of formalist aesthetics, the conception of formalism as common denominator argument (Carroll 1999, chap. 3 and 2001). In this case, formalism is understood as a universal definition of art, such as in Clive Bell’s (1881–1964) formalist manifesto Art, which posits a circular concept (Gardner 1996, 238; Carroll 2001, 95; Stecker 2003, 141) of “aesthetic emotion” elicited by “significant form” that “distinguishes works of art from all other classes of objects” and thereby defines the fine arts as such (Bell 1914, 13). Formalists, as Dziemidok (1993, 192) states, “strive to determine general criteria of valuation universally applicable to all art forms” and thus miss the “values unique” to each artistic medium by commencing with “universalistic assumptions.” As we have seen in sec. 3.b, this definition of formalism contradicts Hanslick’s insistence on the idea that the criteria of the musically beautiful apply solely to music itself and not to the other art forms. Further concepts of general aesthetic formalism prove to be similarly debatable: Small (1998, 135), for example, describes formalist theories as denying that “emotions have anything to do with the proper appreciation of music” (form versus emotion/content), while Mothersill (1984, 222) emphasizes formalism’s conviction that “elements which suggest or establish a link between the artwork and the world should be disregarded” (form versus context). 
In view of OMB, both ideas seem somewhat applicable but at the same time miss something important about Hanslick’s viewpoint: whereas aesthetic analysis, conceived as an objectivist scientific approach, is indeed distinct from historical concerns and the stimulation, expression, or portrayal of definite emotion, music itself affects emotion and is connected intimately to concurrent productions of art and the “poetic, social, scientific conditions” of its time and place (Hanslick 2018, 9, 55; cf. Wilfing 2016, 15–18). In general, any detailed appraisal of Hanslick’s formalism hinges upon the individual definition of aesthetic formalism and of ‘form’ itself, a term that is as ambiguous as it is persistent (Tatarkiewicz 1973, 216) and that might be of limited efficacy in describing Hanslick’s argument, so that it must be employed carefully (Nattiez 1990, 109; Bowman 1991, 53; Payzant 2002, 58).

4. The Intellectual Background of Hanslick’s Aesthetics

a. Hanslick and German Idealism

Historical research on OMB is dominated primarily by questions of intellectual dependency: Who influenced Hanslick’s aesthetic approach and which philosophical movement stimulated the main ideas of his aesthetic approach (Landerer and Wilfing 2018)? Numerous candidates have been invoked as precursors to Hanslick’s “formalism,” ranging from idealist theorists—Kant (1724–1804), Herder (1744–1803), Hegel (1770–1831), Schelling (1775–1854), Vischer (1807–87)—and German poetry—Lessing (1729–81), Goethe (1749–1832), Schiller (1759–1805), or the German literary romantics—to the Austrian context of Hanslick’s aesthetics and “minor” figures such as Michaelis (1770–1834), Novalis (1772–1801), or Nägeli (1773–1836). Generally speaking, current scholarship situates Hanslick’s argument in the (ultimately antithetic) traditions of German idealism and Austrian realism. The most prominent contender as the crucial source of OMB, emphasized particularly in analytical philosophy (Gracyk, chap. 1; Appelqvist 2010–11, 76; Davies 2011b, 297), is Kant’s Kritik der Urteilskraft (Critique of the Power of Judgment, 1790). As OMB is typically regarded as the classical definition of formalistic aesthetics in music and Kant’s Kritik is widely thought to be the origin of general aesthetic formalism, this link appears entirely natural (Ginsborg 2011, 334). Their respective definition of aesthetic intuition as disinterested contemplation, standing apart from rational thought and affect states, as well as their general concept of beauty, which is not subject to an external purpose or definite concepts, establish Hanslick’s awareness of Kant’s theory. Whether Hanslick, who did not receive any formal training in philosophy, ever read Kant or whether he adopted certain notions from post-Kantian aesthetic discourse (Dambeck, Michaelis, Nägeli, and so forth) is open to debate. 
Although Hanslick’s reliance on Kant’s theory is frequently accepted as fact, this view is complicated by at least three issues: (1) Kant’s notion of music as a servant of poetry and as a language of affect states was criticized vigorously by Hanslick. (2) Hanslick’s concept of specific musical beauty directly opposes Kant’s idealist attitude, which stipulates an abstract principle of beauty, administered retroactively to each art form. (3) The objectivist approach of Hanslick’s aesthetics contradicts Kant’s transcendental methodology, the crucial premise of his entire system (Bonds 2014, 188–89; Wilfing 2018, sec. 3.3).

While Kant is mentioned only once in OMB as one of those “eminent people” who did reject any literary content when it came to music (Hanslick 2018, 107), a different contender as the pivotal source of Hanslick’s aesthetics is referred to on multiple occasions: Hegel. Although a large share of Hanslick’s comments on Hegel are intended as criticism (he accuses Hegelian theories of an “underevaluation” of sensuousness in favor of ideas, for example; Hanslick 2018, 42), various quotes and his early music reviews confirm that Hanslick was familiar with Hegel’s aesthetic positions. The theoretical importance of Hegel’s Vorlesungen über die Ästhetik (Lectures on Aesthetics, 1835–38) for the basic tenets of Hanslick’s approach has been investigated particularly by Carl Dahlhaus, who supported his viewpoint by drawing attention to Hanslick’s persistent utilization of the term Geist, which also permeates Hegel’s philosophy. Dahlhaus, however, did not regard Hanslick’s treatise as an uncritical extension of Hegel’s theory of art as the corporeal incarnation of the idea, in which music itself is only form, whereas thoughts and feelings are the content (Dahlhaus 1989, 110). For him, Hanslick’s theory inverts Hegel’s system by making the idea purely musical and thereby turning “form” into a concept of the interior, not the exterior (Burford 2006, 170; Bonds 2012, 8). Although Hanslick’s definition of composing as “intellect shaping itself from within” is probably situated in a general setting of Hegelian reasoning, the whole extent of Hanslick’s awareness of Hegel’s writings is unknown, as no related records survive. The situation is different, however, if we turn to Hegelian aesthetic theorists: We know that he read parts of Vischer’s Aesthetik oder Wissenschaft des Schönen (Aesthetics or Science of Beauty, 1846–57), for example, which might have been the most likely source for his Hegelian leanings (Titus 2008). 
Hanslick candidly criticized Hegelian aesthetics for its historical orientation, which seemingly confused historical research with aesthetic analysis, but he nonetheless emphasized the historical evolution of musical material and the arbitrary appraisal of specific artworks. The idea that artistic material does not merely consist of physical elements (sound, paint, stone), but moreover comprises the entire historical evolution of each art form—the historical interplay between material and mind—was a central concept of Vischer’s theory, linking Hanslick’s approach to Hegelian aesthetics.

b. Hanslick and Austrian Realism

As Hanslick was an Austrian theorist raised in Prague who spent most of his career in Vienna, the delineated relevance of German idealism for the basic tenets of OMB has to be supplemented by an analysis of his Austrian contexts. In the 19th century, Austrian science policies were strongly opposed to philosophical “speculation” that was held responsible for the societal upheaval in the wake of 1789 and 1848. These events caused several reforms of the Austrian school system, the primary purpose of which was to foster the restoration endeavors of the Habsburg leadership by confining education to propaedeutic instruction compatible with Catholic dogmas and state norms. This political strategy resulted in the preservation of Leibnizian philosophy, the flourishing of positivistic scholarship, and the inhibition of German idealism in favor of methods perceived as decidedly scientific. One intellectual who consciously modernized the Leibnizian framework engrained in the academic landscape of Austria was the Prague priest and philosopher Bernard Bolzano (1781–1848). Although Bolzano was forced to resign in 1819 owing to an unfounded accusation of Kantianism, the general precepts of his writings prospered in Habsburg territories by way of his scientific successor and Hanslick’s close friend Robert Zimmermann (1824–98), who attained a tenured position at the University of Vienna in 1861. Bolzano published his aesthetic doctrines in Über den Begriff des Schönen (On the Notion of Beauty, 1843) and Über die Eintheilung der schönen Künste (On the Classification of the Fine Arts, 1849). In similar fashion to Hanslick, he defined aesthetic perception as disinterested contemplation, construed musical listening as an intentional monitoring of compositional development, and dismissed emotivist models whilst insisting on particular aesthetics for each art form. 
Bolzano’s most significant contribution to Hanslick’s aesthetics, however, was his drastically objectivist approach isolated entirely from psychological explanations that might derive from Bolzano’s theory of science. Here, Bolzano outlines his Platonic concept of a “truth as such,” which states something as is, no matter whether this fact has been or ever will be uttered or thought by anyone. The radically objective condition of Hanslick’s concept of musical beauty, which remains beauty “even if it is neither viewed nor contemplated,” matches Bolzano’s Platonic mindset (Bonds 2014, 162; Wilfing 2018, sec. 2).

Another important precursor to Hanslick’s aesthetics, who is significant particularly due to his influence on Austrian science policies in general, is Johann Friedrich Herbart (1776–1841). As Herbart declared natural science the operational benchmark for philosophy and demanded a separation between philosophy, religion, and politics, his approach blended perfectly with the positivistic endeavors of Habsburg authorities and thereby became the semi-official philosophy of Austria. This gradual process was completed by the school reform of 1849, the leading figures of which closely adhered to Herbartian teachings (Landerer and Wilfing 2018, sec. 4), including Zimmermann, Hanslick’s former teacher Franz Exner (1802–53), and his old associate Joseph von Helfert (1820–1910). Hanslick, who attained a position at the ministry of education in 1854, recognized the importance of employing Herbartian principles in OMB, which should set the stage for his academic profession (Payzant 2002, 131). It thus comes as no surprise that Hanslick declared himself a follower of Herbart in his successful habilitation petition of 1856. As recent studies have demonstrated convincingly, however, this personal testimony is probably nothing more than an allusion provoked by careerist concerns (Karnes 2008, 31–34; Bonds 2014, 159; Landerer and Zangwill 2016, 90–91). Direct reference to Herbart is entirely absent from the earlier editions of OMB; he is belatedly included in the third edition of 1865 and the sixth edition of 1881 (Hanslick 1986, 77, 85). In spite of this lack of quotes and in view of Herbart’s bearing on Austrian science policies, it is difficult to imagine that Hanslick was completely unfamiliar with Herbart’s ideas prior to the initial edition of 1854. 
In regard to Hanslick’s argument, Herbartian teachings seem to be important specifically for his formalist approach, for his theory of autonomous instrumental music, for his refutation of emotivist aesthetics, for his emphasis on elemental components of “pure” music and their mutual relations, and for his appreciation of technical musical analysis (Bujić 1988, 7–8; Bonds 2014, 158–62; Wilfing 2018, sec. 2). Generally speaking, the writings of Bolzano and Herbart were similar in various respects, a fact that led to the frequent blending of their work in post-1848 Austria. Specific features of OMB, however, are decidedly Herbartian, such as Hanslick’s concept of emotion deriving from Herbart’s cognitivist reductionism that regards feelings as a subclass of Vorstellungen or presentations (Landerer and Wilfing 2018, 49n).

c. Editorial Problems and Eclectic Origins of OMB

The Austrian contexts of Hanslick’s aesthetics were supremely important for the contentual alterations following the initial edition of OMB (Landerer and Wilfing 2018, sec. 4). The most striking example of these severe changes, owing to the scientific landscape of contemporary Austria, is the removed final paragraph of Hanslick’s classic treatise. OMB originally concluded in idealist fashion, linking the musically beautiful with “all other great and beautiful ideas.” As “pure” music ultimately represents a sounding portrayal of the motions of the cosmos, it eventually transcends its conceptual limitations, “allowing us to feel… the infinite in works of human talent.” The vital traits of musical structure (harmony, rhythm, sound), Hanslick proclaims, permeate the universe so that one can “find anew in music the entire universe” (Bonds 2012, 4; cf. Hanslick 2018, 120). This original ending of OMB evidently betrayed remnants of German idealism and therefore countered Austrian science policies. This discrepancy was pointed out to Hanslick by the foremost Herbartian philosopher of his time and place: Zimmermann. In an extensive review, published in 1854, he commended the positivistic orientation of Hanslick’s argument that apparently conformed to Herbartian aesthetics, but at the same time criticized the idealist notions present in OMB. According to Zimmermann, the idea that the musically beautiful is completely autonomous epitomized the crucial insight of Hanslick’s argument. For him, this advantage of Hanslick’s aesthetics was compromised by his concession to an aesthetics dependent on speculative metaphysics (Bonds 2012, 5–6). As this public review outlined the Herbartian sentiments of Habsburg authorities responsible for his future career, Hanslick deleted the closing remarks as well as additional passages evocative of his former idealist stances (Landerer and Zangwill 2016; Sousa 2017). 
It is for this reason that the historical reception of OMB in anglophone scholarship was impacted markedly by Hanslick’s alterations: whereas German-language discourse is based mostly on the initial edition of OMB, its translations utilized editions 7 (Cohen), 8 (Payzant), and 10 (Rothfarb and Landerer) that read more formalistic and positivistic than earlier versions. As the deleted ending of OMB was translated for the first time as late as 1988 (Bujić 1988, 39) and was not discussed seriously by anglophone academics prior to Bonds’s studies, one can get the impression that scholarship in German and English addresses quite different books (Payzant 2002, 44).

A relevant outcome of current research into Hanslick’s intellectual background, however, is the emerging realization that Hanslick’s aesthetics draws upon a wide array of assorted aesthetic discourses integrated into OMB. It is no contradiction that Hanslick’s emphasis on structural relations between musical elements is derived from Herbartian aesthetics, whilst his concurrent refutation of psychological considerations, supremely important for Herbartian aesthetics, appears to be closer to Bolzano. The same applies to Hanslick’s Vischerian concept of historical evolution, overtly opposing the ahistorical orientation of Herbartian aesthetics, and his anti-Hegelian insistence on a categorial distinction between the methods of historical and aesthetic research derived from Herbartian philosophy (Edgar 1999, 443–44; Landerer and Zangwill 2017, 93–94). Hanslick’s textual strategy frequently resembles a virtual collage, as in a passage reworded for the second edition of 1858: Hanslick maintains that beauty remains beauty “even when it arouses no emotions, indeed when it is neither perceived nor contemplated. Beauty is thus only for the pleasure of a perceiving subject, not generated through that subject” (Bonds 2014, 189; cf. Hanslick 2018, 4). The first part of Hanslick’s quotation is adopted directly from Zimmermann’s review and might even have an immediate antecedent in Bolzano, the former teacher of Zimmermann. Bolzano makes a similar objectivistic statement in On the Notion of Beauty by stating that beauty would remain beauty “even if there existed only one human being in the entire world or no one at all.” The first part of the second sentence, however, alludes to Vischer’s Aesthetics and his concept of Anschauung (perception), thereby directly linking the opposing approaches of Herbartianism and Hegelianism. 
Hanslick purposely disregards Zimmermann’s ensuing assertion that beauty is based on constant relations between aesthetic properties and thus does not change over time, since Hanslick himself acknowledged the historical condition of music and beauty (Landerer and Wilfing 2018, sec. 3). Generally speaking, Hanslick’s argument comprises a multitude of diverse sources, which at times are blatantly antithetic, and his intellectual background is therefore difficult to reconstruct thoroughly. His “eclectic” approach, however, ensured the remarkable durability of Hanslick’s aesthetics, which was not bound by the rise and fall of isolated academic traditions (Bujić 1988, 8).

5. The Reception of Hanslick’s Aesthetics and Its Relevance to Current Discourse

a. A General Outline of Hanslick’s Reception by Austro-German Discourse

The historical reception of Hanslick’s aesthetics, stretching from Viennese Modernism, the beginnings of musicology, and numerous composers to significant philosophers such as Friedrich Nietzsche (1844–1900), Theodor W. Adorno (1903–69), Langer, and analytical aesthetics in general, for the most part represents “terra incognita” (Deaville 2013, 25). Scholarship on Hanslick’s reception is typically restricted to incidental references to conceptual similarities between Hanslick and certain later authors. OMB is mentioned by Karl Popper (1902–94), for example, and probably affected his objective aesthetic approach, his wariness regarding psychological argumentation, and his rejection of emotivism. Ludwig Wittgenstein’s (1889–1951) late work is similarly evocative of Hanslick’s approach, as he declares musical meaning to be purely musical and repudiates the idea that “pure” music could be translated adequately into other modes of expression (Ahonen 2005, 520–23; Szabados 2006, 651–53). Adorno’s adoption of Hanslick’s dynamism (Goehr 2008, 20; Paddison 2010, 131–34) and his distinction between different attitudes towards musical listening betray Hanslick’s impact as much as Adorno’s concept of the historical evolution of musical material (Edgar 1999, 441–44; Paddison 2002, 336), firmly rooted in Hegelian aesthetics. Hanslick’s influence on Nietzsche is particularly remarkable as it spans from his earliest writings to his late work. His vigorous criticism of Wagner in Der Fall Wagner (The Case of Wagner, 1888) and Nietzsche contra Wagner (1889) is evidently inspired by Hanslick’s writings, replicated virtually verbatim on numerous occasions. OMB similarly influenced young Nietzsche, who studied Hanslick’s treatise as early as 1865 and employed Hanslick’s argument in fragment 12[1] of 1871 on the relation between language and “pure” music. 
Here, Nietzsche verbalizes doctrines that are far more indicative of his eventual refutation of Wagner’s oeuvre than his Geburt der Tragödie (Birth of Tragedy, 1872), written at the same time, might suggest. Scholars have thus assumed a rather brief period of unwavering enthusiasm for the Bayreuth composer (Prange 2011). No philosophical movement, however, has addressed Hanslick’s aesthetics as fruitfully as analytical philosophy, particularly so due to its strong focus on the expressive capabilities of “pure” music.

b. Hanslick’s Reception by Analytical Aesthetics and the Direct Impact of OMB

The crucial feature of analytical philosophy is its methodic scientism as the foundation for all philosophy and all knowledge acquisition in general. Current research into the key attributes of analytical aesthetics regularly highlights its tendency to detach the targets of analysis from various contexts in order to establish the possibility of objective observation (Roholt 2017, 50–51). Hanslick’s positivist approach targeted towards scientific objectivity, his strong appeal to natural science as a guideline for objective aesthetics, and his procedural dissociation of musical artworks from external contexts that are not relevant for aesthetic purposes concur with this provisional description of analytical philosophy of music. Historically, Hanslick’s aesthetics was perceived as an important corrective to the “fantastic nonsense” and “sentimental speculations” of idealist theories (Lang 1941, 978; Epperson 1967, 109–10) and therefore contributed to the anti-idealist movement of analytical philosophy aimed against Hegelians such as Francis Bradley (1846–1924), Bernard Bosanquet (1848–1923), or John McTaggart (1866–1925). Early analytical aesthetics of the 1950s and 1960s, which initially needed to cast off its widespread reputation of conducting unscientific guesswork, was concerned principally with abstract problems and attempted to determine an exhaustive definition of art, the quality and quantity of aesthetic properties, and the peculiarity of aesthetic perception (Goehr 1993; Lamarque 2000). Even though this focal point of anglophone philosophy left no room for OMB and its emphasis on musical artworks, Hanslick’s treatise gained traction the moment aesthetics redirected its inquiry towards more concrete subjects. 
Works on issues related to music, increasing strikingly in the 1980s (Lamarque 2000, 14; Davies 2003, 489), proceeded from influential publications by Budd, Davies, and Kivy (all 1980) that featured Hanslick’s aesthetics markedly and set the scene for ensuing decades of anglophone philosophy of music (Davies 2011b, 294). Each of their texts focuses on problems of musical expression and draws on Hanslick’s cognitive concept of emotion, which resembles the approach developed by Stanley Schachter and Jerome Singer in the 1960s. Thus, the development of aesthetics concerned with specific objects and the establishment of cognitivist psychology coincide with and form the basis of Hanslick’s fruitful reception by analytical aesthetics.

Hanslick’s theories, the impact of which has even been compared to David Hume’s (1711–76) historic critique of speculative philosophy (Hanslick 1957, vii), shaped the general position on musical meaning in anglophone philosophy. Even though hardly any current approach concurs entirely with Hanslick’s aesthetics (Zangwill 2004 is a prominent exception), his momentous formulation of certain issues continues to dominate aesthetic discourse (Maus 1992, 273; Davies 2003, 492; Hamilton 2007, 82). This fact is exemplified particularly by authors who discard OMB and its cognitivist orientation, but nonetheless acknowledge that his views are permeating anglophone philosophy (Madell 2002, 1–9). His cognitive hypothesis, however, was not the only argument espoused by analytical academics, who also drew from more specific aspects of OMB. Hanslick’s rejection of basic forms of musical expression, treating affective features as a direct result of the composer’s emotional condition (Hanslick 2018, 63–65), for example, is basically accepted by modern research (Kivy 1980, 14–15; Davies 1986, 148; Naar, chap. 3b). Hanslick justifies this view with the theoretical redundancy of an aesthetic approach that traces the cause of emotional expression to a source located outside of art. Musical expression is successful principally in virtue of the expressive properties of music chosen to indicate a specific feeling and cannot be explained by reference to the artist’s affect states, already absorbed by his creation (Kivy 2009, 250; Davies 2011a, 23; Gracyk 2013, 78–79). Another argument aimed against arousal theories that has been discussed frequently by anglophone philosophers, and that was coined mainly by Budd (1985, 125), is the “heresy of the separable experience” (Ridley 1995, 38–49; Scruton 1997, 145–46; Madell 2002, 32, 57, 99). 
If musical expression is dependent on the response of the listener, music might become nothing more than a random medium of transference, which could be replaced by objects causing an identical response, and loses sight of the individuality of the composition (Hanslick 2018, 91–92). Hanslick proposes that causal theories cannot explain the unique quality of musical artworks as they tend to regard music as a device for affective arousal that could just as well be realized by a warm bath, a cigar, chloroform (Hanslick 2018, 83), or by a drug causing feelings (Kivy 1989, 218, 222, 242; Matravers 1998, 169–85; Robinson 2005, 351, 393, 397).

c. Bypassing Hanslick’s Cognitivist Arguments: Kivy, Davies, and Moods

As we have seen, important objections directed against current theories of musical arousal and expression propounded by anglophone philosophers stem from Hanslick’s aesthetics and extend beyond the cognitivist hypothesis of OMB. His cognitivism is therefore frequently considered the strongest argument that emotivist aesthetics has substantial weaknesses (Kivy 1989, 157; Davies 1994, 209). Hanslick’s (implicit) concept of indeterminate expressivity (Wilfing 2016, 26–29) suggests that emotion is an inherent property of musical structure—an idea that laid the ground for the enhanced formalism of Davies and Kivy, which is based on the similarity perceived between musical motions and the outward features of human emotion. Enhanced formalism does not hold that music refers beyond itself to occurrent emotions but considers expression an objective property of musical structure: music itself is the owner of the emotion it expresses (Davies 1980, 68; Kivy 1980, 64–66). Hanslick, however, had good reasons to abandon enhanced formalism as the theoretical foundation of scientific aesthetics—reasons that paved the way for another argument crucial to analytical aesthetics: the argument from disagreement (Gardner 1996, 245–46; Sharpe 2004, 19–20). While Davies (1994, 213–15) and Kivy (1990, 175–77) fully agree that “pure” music cannot express Platonic attitudes (emotions such as pride or shame that involve complex concepts), they hold that it is able to portray definite emotional properties of a lower order. Hanslick’s attitude is even more skeptical: As the dynamic character of affect states is only one moment of emotion, not emotion itself, music can merely allude to a certain variety of affect states, not to any sentiment in particular, and any survey among an audience regarding the emotion ascribed to a piece would thus yield varied results (Hanslick 2018, 23). 
As enhanced formalism is based on the semblance perceived between musical motion and emotive behavior, Davies and Kivy needed to dismiss Hanslick’s claim about considerable disagreement by gradually retreating to more and more general emotions, which serve as umbrella concepts for specific emotions (Kivy 1980, 46–48; Davies 1994, 246–52). Other scholars pointed to Hanslick’s metaphor of expressive silhouettes and construed his argument in terms of indeterminate expressivity along the lines of Rorschach’s inkblot testing, thereby updating Hanslick’s argument for modern debates (Ahonen 2007, 93).

Generally speaking, OMB introduced numerous important arguments to analytical aesthetics that remain the subjects of current research, such as the famous paradox of negative emotion, which Hanslick directed against theories of musical arousal. If every death march or every somber adagio, Hanslick declares, had the power to elicit grief in the listener, nobody would bother with such works (Hanslick 2018, 90–91). Solutions to Hanslick’s question vary from the rejection of emotive arousal (Kivy 1989, 234–59) and accounts of the way negative emotions have beneficial pedagogic effects (Levinson 1982; Davies 1994, 307–20; Ridley 1995, chap. 7) to revised arousal theories that hold that emotional reactions to music rarely mirror the feeling depicted by a given piece (thus, a somber adagio could arouse compassion instead of sorrow; Matravers 1991 and 1998, chap. 8). Finally, Hanslick’s cognitivist formalism has contributed to a noticeable reframing of the general approach to emotive musical meaning. Matravers, for example, asserted that a piece of music would depict a specific emotion if it arouses a feeling, the physiological components of which would correspond to the emotion depicted (Matravers 1998, 149). As music cannot portray the cognitive elements of genuine emotions, Hanslick’s argument is bypassed by an appeal to feeling as the somatic feature of emotion, which music is able to prompt directly (Matravers 1991, 328). Ridley, who endorses Hanslick’s cognitive objection to common arousal theories, shares this idea by considering “objectless passions” as feelings, the gestural character of which is evoked by the dynamic qualities of music (Ridley 1995). Thus, OMB and its cognitivist orientation occasioned a shift from issues of emotional expression to issues of music’s relation to non-cognitive affect states—a shift also made clear by an increased discussion on music and moods (Radford 1991; Carroll 2003; Sizer 2007). 
Although OMB has thus come under attack in anglophone philosophy, the constant rebuttal of Hanslick’s aesthetics at the same time illustrates the degree to which his approach is ingrained in analytical philosophy in regard to questions of musical meaning. The lion’s share of theorists continue to consider Hanslick’s cognitive argument to be accurate in principle and adjust their models of expressivity accordingly. Hanslick’s influence on current debates thus goes beyond the assenting reception of OMB and remains equally present in modern theories intentionally sidestepping the key argument of Hanslick’s approach.

6. References and Further Reading

a. Primary Sources

  • Hanslick, Eduard. 1950. Music Criticisms, 1846–1899. Translated by Henry Pleasants. Harmondsworth: Penguin Books.
  • Hanslick, Eduard. 1957. The Beautiful in Music: A Contribution to the Revisal of Musical Aesthetics. Edited by Morris Weitz. Translated by Gustav Cohen. Indianapolis: Bobbs-Merrill.
  • Hanslick, Eduard. 1986. On the Musically Beautiful: A Contribution Towards the Revision of the Aesthetics of Music. Translated by Geoffrey Payzant. Indianapolis: Hackett.
  • Hanslick, Eduard. 1993. Sämtliche Schriften: Historisch-kritische Ausgabe. Vol. 1, Aufsätze und Rezensionen 1844–1848. Edited by Dietmar Strauß. Vienna: Böhlau.
  • Hanslick, Eduard. 1994. Sämtliche Schriften: Historisch-kritische Ausgabe. Vol. 2, Aufsätze und Rezensionen 1849–1854. Edited by Dietmar Strauß. Vienna: Böhlau.
  • Hanslick, Eduard. 2018. On the Musically Beautiful: A New Translation. Translated by Lee Rothfarb and Christoph Landerer. Oxford: Oxford University Press.

b. Secondary Sources

  • Ahonen, Hanne. 2005. “Wittgenstein and the Conditions of Musical Communication.” Philosophy 80: 513–29.
  • Ahonen, Hanne. 2007. “Wittgenstein and the Conditions of Musical Communication.” PhD diss., Columbia University.
  • Alperson, Philip. 1984. “On Musical Improvisation.” Journal of Aesthetics and Art Criticism 43, no. 1: 17–29.
  • Alperson, Philip. 2004. “The Philosophy of Music: Formalism and Beyond.” In The Blackwell Guide to Aesthetics, edited by Peter Kivy, 254–75. Malden: Blackwell.
  • Beard, David, and Kenneth Gloag. 2005. Musicology: The Key Concepts. London: Routledge.
  • Bell, Clive. 1914. Art. London: Chatto & Windus.
  • Bonds, Mark Evan. 2012. “Aesthetic Amputations: Absolute Music and the Deleted Endings of Hanslick’s Vom Musikalisch-Schönen.” 19th-Century Music 36, no. 1: 3–23.
  • Bonds, Mark Evan. 2014. Absolute Music: The History of an Idea. Oxford: Oxford University Press.
  • Bowman, Wayne D. 1991. “The Values of Musical ‘Formalism’.” Journal of Aesthetic Education 25, no. 3: 41–59.
  • Brodbeck, David. 2007. “Dvořák’s Reception in Liberal Vienna: Language Ordinances, National Property, and the Rhetoric of ‘Deutschtum’.” Journal of the American Musicological Society 60, no. 1: 71–132.
  • Brodbeck, David. 2009. “Hanslick’s Smetana and Hanslick’s Prague.” Journal of the Royal Musical Association 134, no. 1: 1–36.
  • Brodbeck, David. 2014. Defining ‘Deutschtum’: Political Ideology, German Identity, and Music-Critical Discourse in Liberal Vienna. Oxford: Oxford University Press.
  • Budd, Malcolm. 1980. “The Repudiation of Emotion: Hanslick on Music.” British Journal of Aesthetics 20, no. 1: 29–43.
  • Budd, Malcolm. 1985. Music and the Emotions: The Philosophical Theories. London: Routledge.
  • Bujić, Bojan. 1988. Music in European Thought, 1851–1912. Cambridge: Cambridge University Press.
  • Burford, Mark. 2006. “Hanslick’s Idealist Materialism.” 19th-Century Music 30, no. 2: 166–81.
  • Carroll, Noël. 1999. Philosophy of Art: A Contemporary Introduction. London: Routledge.
  • Carroll, Noël. 2001. “Formalism.” In The Routledge Companion to Aesthetics, edited by Berys Gaut and Dominic McIver Lopes, 87–96. London: Routledge.
  • Carroll, Noël. 2003. “Art and Mood: Preliminary Notes and Conjectures.” The Monist 86, no. 4: 521–55.
  • Cook, Nicholas. 2001. “Theorizing Musical Meaning.” Music Theory Spectrum 23, no. 2: 170–95.
  • Dahlhaus, Carl. 1989. The Idea of Absolute Music. Translated by Roger Lustig. Chicago: University of Chicago Press.
  • Davies, Stephen. 1980. “The Expression of Emotion in Music.” Mind 89: 67–86.
  • Davies, Stephen. 1986. “The Expression Theory Again.” Theoria 52, no. 3: 146–67.
  • Davies, Stephen. 1994. Musical Meaning and Expression. Ithaca: Cornell University Press.
  • Davies, Stephen. 2003. “Music.” In Levinson 2003, 489–515.
  • Davies, Stephen. 2011a. Musical Understandings and Other Essays on the Philosophy of Music. Oxford: Oxford University Press.
  • Davies, Stephen. 2011b. “Analytic Philosophy and Music.” In Gracyk and Kania 2011, 294–304.
  • Deaville, James. 2013. “Negotiating the ‘Absolute’: Hanslick’s Path Through Musical History.” In Grimes, Donovan, and Marx 2013, 15–37.
  • Downes, Stephen, ed. 2014. Aesthetics of Music: Musicological Perspectives. New York: Routledge.
  • Dziemidok, Bohdan. 1993. “Artistic Formalism: Its Achievements and Weaknesses.” Journal of Aesthetics and Art Criticism 51, no. 2: 185–93.
  • Edgar, Andrew. 1999. “Adorno and Musical Analysis.” Journal of Aesthetics and Art Criticism 57, no. 4: 439–49.
  • Epperson, Gordon. 1967. The Musical Symbol: An Exploration in Aesthetics. Ames: Iowa State University Press.
  • Fisher, John Andrew. 1993. Reflecting on Art. Mountain View: Mayfield.
  • Gardner, Sebastian. 1996. “Aesthetics.” In The Blackwell Companion to Philosophy, edited by Nicholas Bunnin and E. P. Tsui-James, 229–56. Oxford: Blackwell.
  • Ginsborg, Hannah. 2011. “Kant.” In Gracyk and Kania 2011, 328–38.
  • Goehr, Lydia. 1993. “The Institutionalization of a Discipline: A Retrospective of the Journal of Aesthetics and Art Criticism and the American Society of Aesthetics, 1939–1992.” Journal of Aesthetics and Art Criticism 51, no. 2: 99–121.
  • Goehr, Lydia. 2008. Elective Affinities: Musical Essays on the History of Aesthetic Theory. New York: Columbia University Press.
  • Gooley, Dana. 2011. “Hanslick and the Institution of Criticism.” Journal of Musicology 28, no. 3: 289–324.
  • Gracyk, Theodore. 2013. On Music. London: Routledge.
  • Gracyk, Theodore and Andrew Kania, eds. 2011. The Routledge Companion to Philosophy and Music. London: Routledge.
  • Grey, Thomas S. 1995. Wagner’s Musical Prose: Texts and Contexts. Cambridge: Cambridge University Press.
  • Grey, Thomas S. 2002. “Hanslick, Eduard.” In The New Grove Dictionary of Music and Musicians, edited by Stanley Sadie, 10:827–33. London: Macmillan.
  • Grey, Thomas S. 2003. “Masters and Their Critics: Wagner, Hanslick, Beckmesser, and Die Meistersinger.” In Wagner’s “Meistersinger”: Performance, History, Representation, edited by Nicholas Vazsonyi, 165–89. Rochester: University of Rochester Press.
  • Grey, Thomas S. 2011. “Hanslick.” In Gracyk and Kania 2011, 360–70.
  • Grey, Thomas S. 2014. “Absolute Music.” In Downes 2014, 42–61.
  • Grimes, Nicole, Siobhán Donovan, and Wolfgang Marx, eds. 2013. Rethinking Hanslick: Music, Formalism, and Expression. Rochester: University of Rochester Press.
  • Hamilton, Andy. 2007. Aesthetics and Music. London: Continuum Books.
  • Karnes, Kevin. 2008. Music, Criticism, and the Challenge of History: Shaping Modern Musical Thought in Late Nineteenth-Century Vienna. Oxford: Oxford University Press.
  • Kivy, Peter. 1980. The Corded Shell: Reflections on Musical Expression. Princeton: Princeton University Press.
  • Kivy, Peter. 1989. Sound Sentiment: An Essay on the Musical Emotions. Philadelphia: Temple University Press.
  • Kivy, Peter. 1990. Music Alone: Philosophical Reflections on the Purely Musical Experience. Ithaca: Cornell University Press.
  • Kivy, Peter. 2002. Introduction to a Philosophy of Music. Oxford: Clarendon Press.
  • Kivy, Peter. 2009. Antithetical Arts: On the Ancient Quarrel Between Literature and Music. Oxford: Clarendon Press.
  • Lamarque, Peter. 2000. “The British Journal of Aesthetics: Forty Years On.” British Journal of Aesthetics 40, no. 1: 1–20.
  • Landerer, Christoph and Nick Zangwill. 2016. “Contemplating Musical Essence.” Journal of the Royal Musical Association 141, no. 2: 483–94.
  • Landerer, Christoph and Nick Zangwill. 2017. “Hanslick’s Deleted Ending.” British Journal of Aesthetics 57, no. 1: 85–95.
  • Lang, Paul Henry. 1941. Music in Western Civilization. New York: Dent & Sons.
  • Larkin, David. 2013. “Battle Rejoined: Hanslick and the Symphonic Poem in the 1890s.” In Grimes, Donovan, and Marx 2013, 289–310.
  • Levinson, Jerrold. 1982. “Music and Negative Emotion.” Pacific Philosophical Quarterly 63: 327–46.
  • Levinson, Jerrold, ed. 2003. The Oxford Handbook of Aesthetics. Oxford: Oxford University Press.
  • Lippman, Edward A. 1992. A History of Western Musical Aesthetics. Lincoln: University of Nebraska Press.
  • Madell, Geoffrey. 2002. Philosophy, Music, and Emotion. Edinburgh: Edinburgh University Press.
  • Matravers, Derek. 1991. “Art and the Feelings and Emotions.” British Journal of Aesthetics 31, no. 4: 322–31.
  • Matravers, Derek. 1998. Art and Emotion. Oxford: Clarendon Press.
  • Maus, Fred Everett. 1992. “Hanslick’s Animism.” Journal of Musicology 10, no. 3: 273–92.
  • McColl, Sandra. 1995. “To Bury Hanslick or to Praise Him? The Obituaries of August 1904.” Musicology Australia 18, no. 1: 39–51.
  • Mothersill, Mary. 1984. Beauty Restored. Oxford: Clarendon Press.
  • Nattiez, Jean-Jacques. 1990. Music and Discourse: Toward a Semiology of Music. Translated by Carolyn Abbate. Princeton: Princeton University Press.
  • Paddison, Max. 2002. “Music as Ideal: The Aesthetics of Autonomy.” In The Cambridge History of Nineteenth-Century Music, edited by Jim Samson, 318–42. Cambridge: Cambridge University Press.
  • Paddison, Max. 2010. “Mimesis and the Aesthetics of Musical Expression.” Music Analysis 29, no. 1–3: 126–48.
  • Payzant, Geoffrey. 1985. “Eduard Hanslick’s Vom Musikalisch-Schönen: A pre-publication excerpt.” The Music Review 46: 179–85.
  • Payzant, Geoffrey. 1989. “Eduard Hanslick and Bernhard Gutt.” The Music Review 50: 124–33.
  • Payzant, Geoffrey. 1991. Eduard Hanslick and Ritter Berlioz in Prague: A Documentary Narrative. Calgary: University of Calgary Press.
  • Payzant, Geoffrey. 2002. Hanslick on the Musically Beautiful: Sixteen Lectures on the Musical Aesthetics of Eduard Hanslick. Christchurch: Cybereditions.
  • Pederson, Sanna. 1996. “Romantic Music Under Siege in 1848.” In Music Theory in the Age of Romanticism, edited by Ian Bent, 57–74. Cambridge: Cambridge University Press.
  • Pederson, Sanna. 2014. “Romanticism/Anti-Romanticism.” In Downes 2014, 170–87.
  • Prange, Martine. 2011. “Was Nietzsche Ever a True Wagnerian? Nietzsche’s Late Turn to and Early Doubt About Richard Wagner.” Nietzsche Studien 40: 43–71.
  • Pryer, Anthony. 2013. “Hanslick, Legal Processes, and Scientific Methodologies: How Not to Construct an Ontology of Music.” In Grimes, Donovan, and Marx 2013, 52–69.
  • Radford, Colin. 1991. “Muddy Waters.” Journal of Aesthetics and Art Criticism 49, no. 3: 247–52.
  • Ridley, Aaron. 1995. Music, Value, and the Passions. Ithaca: Cornell University Press.
  • Robinson, Jenefer. 2005. Deeper Than Reason: Emotion and Its Role in Literature, Music, and Art. Oxford: Clarendon Press.
  • Roholt, Tiger C. 2017. “On the Divide: Analytic and Continental Philosophy of Music.” Journal of Aesthetics and Art Criticism 75, no. 1: 49–58.
  • Rosengard Subotnik, Rose. 1991. Developing Variations: Style and Ideology in Western Music. Minneapolis: University of Minnesota Press.
  • Scruton, Roger. 1997. The Aesthetics of Music. Oxford: Clarendon Press.
  • Sharpe, R. A. 2004. Philosophy of Music: An Introduction. Chesham: Acumen.
  • Sizer, Laura. 2007. “Moods in the Music and the Man: A Response to Kivy and Carroll.” Journal of Aesthetics and Art Criticism 65, no. 3: 307–12.
  • Small, Christopher. 1998. Musicking: The Meanings of Performing and Listening. Hanover, NH: Wesleyan University Press.
  • Sousa, Tiago. 2017. “Was Hanslick a Closet Schopenhauerian?” British Journal of Aesthetics 57, no. 2: 211–29.
  • Stecker, Robert. 2003. “Definition of Art.” In Levinson 2003, 136–54.
  • Szabados, Béla. 2006. “Wittgenstein and Musical Formalism.” Philosophy 81: 649–58.
  • Tatarkiewicz, Władysław. 1973. “Form in the History of Aesthetics.” Dictionary of the History of Ideas 2: 216–25.
  • Titus, Barbara. 2008. “The Quest for Spiritualized Form: (Re)positioning Eduard Hanslick.” Acta Musicologica 80, no. 1: 67–98.
  • Trivedi, Saam. 2011. “Resemblance Theories.” In Gracyk and Kania 2011, 223–32.
  • Yanal, Robert J. 2006. “Hanslick’s Third Thesis.” British Journal of Aesthetics 46, no. 3: 259–66.
  • Yoshida, Hiroshi. 2001. “Eduard Hanslick and the Idea of ‘Public’ in Musical Culture: Towards a Socio-Political Context of Formalistic Aesthetics.” International Review of the Aesthetics and Sociology of Music 32, no. 2: 179–99.
  • Zangwill, Nick. 2004. “Against Emotion: Hanslick Was Right About Music.” British Journal of Aesthetics 44, no. 1: 29–43.

Research for this article was supported financially by the Austrian Science Fund (FWF, project number P30554-G30).

Author Information

Alexander Wilfing
Email: alexander.wilfing@oeaw.ac.at
Austrian Academy of Sciences
Austria

and

Christoph Landerer
Email: chlanderer@gmail.com
Austria

The Semantic Theory of Truth

The semantic theory of truth (STT, hereafter) was developed by Alfred Tarski in the 1930s. The theory has two separate, although interconnected, aspects. First, it is a formal mathematical theory of truth as a central concept of model theory, one of the most important branches of mathematical logic. Second, it is also a philosophical doctrine which elaborates the notion of truth investigated by philosophers since antiquity. In this respect, STT is one of the most influential ideas in contemporary analytic philosophy. This article discusses both aspects.

The STT is designed to define truth without circularity and to satisfy certain minimal conditions that must be met by any adequate theory of truth.

STT as a formal construction is explicated via set theory and the concept of satisfaction. The prevailing philosophical interpretation of STT considers it to be a version of the correspondence theory of truth that goes back to Aristotle. This theory is presented here in its modern shape, that is, as associated with first-order logic. Tarski’s original account used the elementary theory of classes (a theory similar to the simple theory of types).

One of Tarski’s most important results was to show that a theory of truth for set theory cannot be given within set theory itself, and that any truth definition for a formal language L must be given in a language which is essentially stronger than L.

Table of Contents

  1. Historical Introduction
  2. Outline of STT
  3. Informal Presentation of STT
  4. Formal Presentation of STT
  5. Philosophical Comments
  6. Final Remarks
  7. References and Further Reading

1. Historical Introduction

Alfred Tarski (1901–1983) was a Polish mathematician, logician and philosopher. He lived in the U.S.A. from 1939 onward and became an American citizen in 1945. He was a member of the Polish Mathematical School, the Warsaw School of Logic and the Lvov-Warsaw Philosophical School. These schools flourished in the interwar period (1918-1939).

While investigating problems associated with the definability of real numbers, Tarski came to the conclusion that the concept of satisfaction informally used in mathematics can help in defining the concept of truth. In 1930, he delivered two lectures (one in Warsaw, the second in Lvov) devoted to the concept of truth. In 1931, he began to work on a monograph on this topic. It was published in 1933 (see Tarski 1933) as Pojęcie prawdy w językach nauk dedukcyjnych (The Concept of Truth in Languages of Deductive Sciences). This book was well-received in Poland.

Due to Tarski’s contacts with the Vienna Circle, his semantic ideas became known abroad. The German translation (Der Wahrheitsbegriff in den formalisierten Sprachen) of Tarski’s Polish book appeared in 1935 (see Tarski 1935). In the same year, Tarski lectured at the Paris Congress for Scientific Philosophy; his lectures on the foundations of semantics and on the concept of logical consequence were applauded (see Tarski 1936 and Tarski 1936a). His popular paper on the concept of truth appeared in Philosophy and Phenomenological Research in 1944 (see Tarski 1944). The English translation based on the German version of the book on truth (see Tarski 1956a) was included in Tarski’s famous collection Logic, Semantics, Metamathematics (1956). Tarski’s last essay on truth (more popular than formal), namely “Truth and Proof”, was published in 1969 (see Tarski 1969). Since all of Tarski’s writings on truth present essentially the same ideas, this article refers to his particular works only in a few places.

2. Outline of STT

The semantic theory of truth (STT) has many ingredients. The most important are as follows:

  • (A) Truth as a property of sentences;
  • (B) Relations between truth and meaning;
  • (C) Diagnosis of semantic paradoxes;
  • (D) Resolution of semantic paradoxes;
  • (E) Relativization to languages;
  • (F) T-scheme (A is true if and only if A);
  • (G) The principle BI of bivalence;
  • (H) Material and formal adequacy of a truth-definition;
  • (I) Conditions imposed on a metalanguage in order to obtain a proper truth-definition;
  • (J) The relation between language and metalanguage;
  • (K) The truth-definition itself;
  • (L) Maximality of the set of truths in a given language;
  • (M) The undefinability theorem.

These points are gradually elaborated in the next remarks, with capital letters referring back to the above list.

(A)–(B). For Tarski, sentences are truth-bearers. However, sentences are always equipped with meanings. Tarski avoided explaining what the meaning of an expression is. On the other hand, he explicitly said that the problem of defining truth is meaningless for purely informal languages. Roughly speaking, the semantic truth-definition (SDT, for brevity) is formulated for formalized languages.

(C)–(D). The Liar Paradox is a serious problem for any truth-definition. The ancient version attributed to Epimenides runs as follows. A Cretan says “I am lying now”. If he is actually lying, his sentence is true, but if he is not lying, the sentence in question is false. Contradiction! For the modern version, consider the sentence

(S) The sentence denoted by (S) is false.

Observe that (S) = ((S) is false). Since (S) and ‘(S) is true’ are equivalent, we obtain a contradiction expressed by

(LP) (S) is true if and only if (S) is false.

What are the sources of the Liar Paradox (LP)? First, it employs the sentence (S), which asserts its own falsity. Such a situation is called a self-referential use of a semantic concept; the semantic concept in this case is falsehood. Second, the Paradox uses the rule that a sentence, let us say A, is true if and only if A (which Tarski called the T-scheme). Third, we apply classical logic, in particular the principle of bivalence, that is, (BI).
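The interplay of these three ingredients can be checked mechanically. The following is only a minimal sketch, not part of Tarski's own presentation: it assumes bivalence (only two candidate values) and reads the T-scheme as the requirement that the value of (S) equal the value of what (S) says. The function name is hypothetical.

```python
def liar_constraint_holds(value_of_S: bool) -> bool:
    """Check (LP) for one candidate classical truth value of (S).

    (S) says that (S) is false, so what (S) says has the value
    `not value_of_S`. The T-scheme demands that (S) be true exactly
    when what it says is the case.
    """
    what_S_says = not value_of_S
    return value_of_S == what_S_says


# Under bivalence the only candidate values are True and False.
consistent_values = [v for v in (True, False) if liar_constraint_holds(v)]
print(consistent_values)  # prints []
```

No classical value survives, which is why at least one of the three ingredients has to be given up.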

This diagnosis, which was proposed by Stanisław Leśniewski (Tarski’s teacher in Warsaw) and adopted by Tarski, offers three ways out of the Paradox. First, one could eliminate self-referentiality from the language. Second, one could reject the T-scheme. Third, one could change the logic, in particular by rejecting (BI). The third strategy is popular in the twenty-first century, and it uses the techniques of many-valued logic, logic with truth-value gaps, or paraconsistent logic. These solutions will not be commented upon in this article. In any case, Tarski considered them to be too complex and too narrow because they require the rejection of what should be retained. The T-scheme, according to him, is so intuitive that it cannot be rejected. Thus, he concluded, the proper solution is to eliminate self-referentiality.

(E)–(F). How can self-referentiality be eliminated? The main idea is that the concept of truth should be relativized to a language. More specifically, we deal with the context ‘the sentence A is true in a language L’. However, this move is still insufficient, because if self-referentiality is to be banished, the adjective ‘true’ must belong to another language. This new language is called the metalanguage and is abbreviated by the symbol ML (we assume that L is a corresponding language). The simplest and the most popular situation is that L is an object-language (used to speak about the world) and ML forms its metalanguage, suitable for speaking about L. Here is an example. Assume that German is our object-language and English serves as the associated metalanguage. We write in L ‘Schnee ist weiss’, but in ML we write ‘The German sentence “Schnee ist weiss” means that snow is white’. We see that ML must contain resources for speaking about expressions belonging to L. In order to indicate that we are speaking about L-expressions, we use quotation marks, but many other devices can be employed. For instance, we can use italics and write that the sentence Schnee ist weiss means that snow is white. The most important observation is that expressions like ‘Schnee ist weiss’ and Schnee ist weiss are (metalinguistic) names in ML of the corresponding German sentence that is in L. The standard way of capturing the reported distinction is to say that expressions are used in L, but mentioned in ML.

The above conventions function as a part of STT. A simple example is

(1) ‘Schnee ist weiss’ in German is true if and only if snow is white.

The interaction of two languages in (1) consists in the fact that the name of the German sentence is on the left, and its English translation is on the right. If the same language functions as both L and ML, one should speak about self-translation. According to the foregoing explanations we can generalize (1) into

(TS) ‘A’ is true in L if and only if A*,

where the symbol A* refers to a translation of the sentence denoted by the name ‘A’. It is the general form of the T-scheme. (For additional discussion of the T-scheme, see the Liar Paradox.) Note that we cannot replace (TS) by

(2) For any A, ‘A’ is true if and only if A,

because the letter A is not free in the expression ‘A’. Quotes can be regarded as a name-forming operator. Anyway, concrete biconditionals (T-sentences, T-equivalences) arising from (TS) play the crucial role in STT. Roughly speaking, they capture the following intuition: a sentence saying so and so is true if so and so.

All explanations given above are formulated in ordinary English. It is easy to see several inconveniences of this approach. For instance, quotes must be multiplied when we pass from using to mentioning: we must write ‘‘A’’ when ‘A’ is mentioned. To simplify the issue, we replace some occurrences of quotes by such expressions as ‘name’, ‘sentence’, and so forth. Also, the concept of translation as applied to ordinary languages is not precise. The most important thing is that ordinary languages contain their own metalanguages, that is, they are (to use Tarski’s way of speaking) semantically closed. This circumstance causes semantic paradoxes; the Liar is only one of them, but we will not consider others.

Tarski was very sceptical about the possibility of successfully providing a coherent truth-definition for ordinary language. Hence, he worked with a formal language. Such a language must have a well-defined alphabet (the set of elementary expressions), a well-defined set of formulas and a logical basis. If L is a formalized language, its ML is only partially formal, usually a part of ordinary mathematics. The following example illustrates the issue. Let ‘P(a)’ be the considered formula. It is an atomic formula of first-order language and says that a is P (the object a has a property P). The truth conditions of this sentence should be formulated by

(3) ‘P(a)’ is true if and only if a is a member of the set P,

where the non-italic letter P refers to the set that is denoted by the italicized predicate letter P. When (3) is expressed more formally in set theory, the binary relation “is a member of” is usually represented by the Greek letter epsilon, namely \in. In this example, the language of set theory serves as the metalanguage ML. To finish this part, note that Tarski liberalized his early negative attitude to ordinary speech. In his later works, he introduced the concept of languages having specified structure (see Tarski 1944). They are not semantically closed formalized languages, but are well-described by specification of their units, complex expressions and the underlying logic.

3. Informal Presentation of STT

 

As noted earlier, Tarski considered the concept of satisfaction (more precisely, the satisfaction relation) as basic for defining truth. In particular, truth is to be defined as a special case of satisfaction. Assume that L is given – it is a first-order formal language. Open formulas are defined as containing free variables. By contrast, closed formulas have no free variables – for instance, P(a) or \exists x P(x). Open formulas are satisfied or not, depending upon how the free variables are interpreted in a given domain D, but sentences are true or false. Take the formula ‘x is a city’. Let D consist of cities and rivers. Our formula is satisfied by London, but not by the Thames (we assume that the name ‘Thames’ refers to the river Thames). Furthermore, the sentence ‘London is a city’ is true in D, but the sentence ‘Thames is a city’ is false in D. Roughly speaking, satisfaction converts open formulas into true sentences, but non-satisfaction into false ones. Moreover, these considerations show that an instance of the T-scheme, namely the equivalence ‘the sentence ‘London is a city’ is true if and only if London is a city’ correctly displays the main ordinary intuition associated with the predicate ‘is true’.

The above explanations do not provide a definition of truth. Consider now two collections of ideas:

(A) (General case): open formulas,
satisfaction by some objects from D;
non-satisfaction by some objects from D;

(Special case): closed formulas (sentences), satisfaction by ?;
non-satisfaction by ?

Inspecting the formulas ‘x is a city’ and ‘London is a city’ leads to the conclusion that although satisfaction depends on the valuation of free variables, truth and falsehood do not (a valuation, given by a valuation function, consists in attributing denotations from D to expressions of L). The reason is very simple and even trivial, namely that sentences have no free variables. Consequently, truth and falsehood should (even must) be independent of how the valuation function acts with respect to terms that are free variables. On the other hand, logical values are determined by valuations of constants (individual names, such as ‘London’) and predicates (such as ‘is a city’) as well as by the understanding of logical constants (propositional connectives, quantifiers and identity).

The last observation motivates the following formulation of SDT assuming that the domain of interpretation D is fixed:

(3) (a) ‘A’ is true if and only if ‘A’ is satisfied by any object in D;

(b) ‘A’ is false if and only if ‘A’ is satisfied by no object in D.

Using ‘London is a city’ as an example we have that this sentence is true if and only if it is satisfied by any object from D (this formulation will be commented upon below). Now, (A) can be corrected by dropping question-marks as

(B) Open formulas: satisfaction by some objects from D, but not others;

sentences: satisfaction by all objects from D (truth);

open formulas: non-satisfaction by some objects from D;

sentences: satisfaction by no objects from D (falsity).
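The classification in (B) can be illustrated with a small sketch. It is a toy model under stated assumptions, not Tarski's construction: the domain D, the set CITIES, and all function names are hypothetical, and a formula is modelled simply as a function from an object of D to a truth value.

```python
# Illustrative domain of cities and rivers; which items count as cities is an assumption.
D = {"London", "Manchester", "Thames"}
CITIES = {"London", "Manchester"}


def x_is_a_city(obj):
    """Open formula 'x is a city': the verdict depends on the object assigned to x."""
    return obj in CITIES


def london_is_a_city(obj):
    """Sentence 'London is a city': no free variable, so obj is ignored."""
    return "London" in CITIES


def thames_is_a_city(obj):
    """Sentence 'Thames is a city': no free variable, so obj is ignored."""
    return "Thames" in CITIES


def classify(formula):
    """Classify a formula by the objects of D that satisfy it, following (B)."""
    satisfiers = {obj for obj in D if formula(obj)}
    if satisfiers == D:
        return "satisfied by all objects"
    if not satisfiers:
        return "satisfied by no object"
    return "satisfied by some objects, but not others"


print(classify(x_is_a_city))       # satisfied by some objects, but not others
print(classify(london_is_a_city))  # satisfied by all objects
print(classify(thames_is_a_city))  # satisfied by no object
```

For the two sentences, the first verdict corresponds to truth and the second to falsity in D; the middle verdict is available only to open formulas.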

The formal version of (B) is formulated in the next section.

The definition of sentences as open formulas without free variables looks at first sight like an artificial mathematical trick, but such constructions frequently occur in mathematical practice as useful simplifications. For example, the straight line can be considered as a special case of a curve, or Euclidean space as a special instance of Riemannian space, and so forth. Consequently, (B) can be charged with being a result of a purely formal game, completely alien to ordinary and philosophical intuitions. Tarski did not conceal that his explanations pertaining to truth employ mathematical concepts and techniques that are perhaps fairly obvious for practising mathematicians, but that are not convincing as tools of a reasonable philosophical analysis. This article does not do that. However, one can also try to argue that this definition fulfills some intuitive constraints. For instance, it entails that no sentence is true and false at the same time (the metalogical principle of contradiction). On the other hand, if A is an open formula, it is not the case that either A is satisfied or \neg A is satisfied. The formulas P(x) and \neg P(x) can serve as an example – both can be satisfied; for instance, ‘x is a city’ and ‘x is not a city’ can both be satisfied, though not by the same object. This example shows that, generally speaking, satisfaction of open formulas has some other properties than truth attributed to sentences, although both concepts are related in many ways. By definition, every sentence is satisfied by all objects or by no object. Assume that the formula \forall x P(x) is true and, thereby, satisfied by every object. Its negation, the formula \exists x \neg P(x), is satisfied by no object. This assertion implies the metalogical principle of the excluded middle. Thus, we reach (BI) (the principle of bivalence).

Let us try to come up with a philosophical paraphrase of the statement that if truth and falsehood are independent of valuations of free variables, then the logical values of sentences depend on how things are in the considered universe, in our example, in D. It is time to introduce (informally, but it suffices) the concept of a model. Models are algebraic structures consisting of a universe U (that is, a set of objects; some items can be distinguished and named by special names – individual constants) and relations defined on U (other elements of a model are omitted). If X is a set of sentences and M is its model, then all sentences belonging to X are true in M. Perhaps we could say that if truth and falsehood are indeed free of such valuations, then whether sentences have definite logical values depends on how things are in a relevant model.

Two additional remarks are in order. First, satisfaction by all objects cannot be regarded as equivalent to being a logical tautology. Satisfaction is always relative to a chosen (fixed) universe. In particular, all conclusions made in this section assume that the stock of predicates – such as ‘is a city’ – is established in advance and its elements have a definite meaning that stems from a specific interpretation. If A is a logical tautology, this means that A is true (now in the outlined sense) in all models. Second, SDT relativizes truth (and falsehood) not only to L, but also to M. To sum up, SDT considers truth as relativized to an interpretation of L via M. In fact, SDT defines the set of true sentences in a given L. This literally means that the definition in question is extensional, that is, it determines the scope of the predicate ‘is true’. However, taking into account that every definition of a given set X as a reference of a predicate P, directly or indirectly, deals with the content of P, SDT offers an understanding of the property expressed by P.

To be satisfactory, SDT must conform to so-called conditions of adequacy. More specifically, this definition must be (a) formally correct, and (b) materially correct. Condition (a) means that the definition does not lead to paradoxes and is not circular. These requirements involve the interplay of L and ML functioning as insurance against semantic inconsistencies. Moreover, SDT does not appeal to the concept of truth for ML. Condition (b) is formulated as the Convention T (CT, for brevity), stating that (i) a formally correct truth-definition should logically entail all instances of the T-scheme available in L; (ii) Tr \subseteq L (the set of true sentences of L is a subset of the entire L). CT shows that the T-scheme is not itself the required truth-definition. On the other hand, Tarski underlined that every particular T-sentence provides a partial definition of truth for a given sentence. One could possibly form the conjunction of all T-equivalences as the definition, but this formula would be infinite in length (thus, this maneuver is limited to finite languages). Moreover, the T-scheme does not imply (BI).
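For a finite language, the two parts of CT can be checked by brute force. The sketch below is a deliberately crude illustration, not Tarski's definition: the two-sentence language, the sentence ‘Gras ist rot’, the table FACTS (standing in for what the metalanguage asserts), and all names are assumptions made for the example, and "entailment of T-sentences" is simplified to agreement with FACTS.

```python
# Hypothetical finite object language L: German sentences with English translations.
TRANSLATION = {
    "Schnee ist weiss": "snow is white",
    "Gras ist rot": "grass is red",
}

# Assumed facts, stated in the metalanguage, for the sake of the example.
FACTS = {"snow is white": True, "grass is red": False}

# Candidate truth-definition: Tr is listed explicitly, without using the word 'true'.
TR = {"Schnee ist weiss"}


def t_sentence(name):
    """Spell out the T-sentence for the sentence of L called `name`."""
    return f"‘{name}’ is true in L if and only if {TRANSLATION[name]}"


def materially_adequate(tr):
    """Check both parts of CT for the candidate set `tr` against FACTS."""
    every_t_sentence_holds = all(
        (name in tr) == FACTS[TRANSLATION[name]] for name in TRANSLATION
    )
    tr_is_subset_of_L = tr <= set(TRANSLATION)
    return every_t_sentence_holds and tr_is_subset_of_L


print(t_sentence("Schnee ist weiss"))
print(materially_adequate(TR))
```

With infinitely many sentences the explicit listing of Tr is no longer available, which is where the recursive definition via satisfaction comes in.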

A standard objection against STT points out that it stratifies the concept of truth. This is because we have the entire hierarchy of languages L0 (the object language), L1 (= ML0), L2 (= ML1), L3 (= ML2), …. Denote this hierarchy by the symbol HL. It is infinite and, moreover, there is no universal metalanguage allowing a truth-definition for the entire HL. Such a language would be semantically closed and, thereby, inconsistent. STT generates the hierarchy ‘truth in L0’, ‘truth in L1’, ‘truth in L2’, …, contrary to the ordinary use of ‘is true’, which is not stratified. Thus, SDT must be separately performed at every level of HL. Two observations are in order in this context. Firstly, we have that Tr(Ln) \subset Tr(Ln+1), for every n, due to the fact that every Ln is translated into its metalanguage Ln+1. Consequently, HL is cumulative, that is, Tr(Ln+1) includes all truths of Ln. Secondly, taking first-order logic as the foundation and assuming the Hilbert thesis (every theory can be formalized in a first-order language), we define ‘true in the first-order L’ in ML. This second language is partially informal. In fact, SDT for first-order languages requires tools from weak second-order logic (but this is too formal an issue to be explained in this survey). Thus, the stratification objection (originally formulated for Tarski’s construction via a simple theory of types) can be easily discarded and we can stay with one concept of truth. The price is that the concept of truth cannot be used for sentences formulated in ML.

4. Formal Presentation of STT

The earlier explanations concerned the simplest case, namely satisfaction of monadic open formulas, that is, of the form P(x). What about the formula (a) ‘x is a larger city than y’, which expresses the relation of being a larger city? We can say that the sequence <London, Manchester> satisfies (a), but not the sequence <Manchester, London>. (This article assumes the reader knows logical notations and elementary set-theoretical concepts, particularly the concept of sequence.) Since formulas can have arbitrary length, we need a generalization of this procedure in order to have a uniform way of dealing with all cases. This was Tarski’s motivation for introducing the concept of satisfaction by means of infinite sequences of objects. Since formulas are of arbitrary but always finite length, infinite sequences have a sufficient number of members to cover the satisfaction of all possible cases of particular formulas. This intuition is articulated by

(4) A is satisfied by an infinite sequence s = <s1, s2, s3, …>, where sn (n \geq 1) refers to the nth term of s.

The definition of satisfaction (SAT; the symbol I refers to an interpretation) is as follows (this article simplifies indexing, and it restricts terms to individual variables and individual constants; knowledge of this logical notation is assumed):

(5) (a) ‘Pj(t1, …, tk)’ \in SAT(s, I) ⇔ <I(‘t1’), …, I(‘tk’)> \in Rj (= I(‘Pj’));

(b) ‘\neg A’ \in SAT(s, I) ⇔ ‘A’ \notin SAT(s, I);

(c) ‘A \wedge B’ \in SAT(s, I) ⇔ ‘A’ \in SAT(s, I) and ‘B’ \in SAT(s, I);

(d) ‘A \vee B’ \in SAT(s, I) ⇔ ‘A’ \in SAT(s, I) or ‘B’ \in SAT(s, I);

(e) ‘A \rightarrow B’ \in SAT(s, I) ⇔ ‘\neg A’ \in SAT(s, I) or ‘B’ \in SAT(s, I);

(f) ‘A \leftrightarrow B’ \in SAT(s, I) ⇔ ‘A \rightarrow B’ \in SAT(s, I) and ‘B \rightarrow A’ \in SAT(s, I);

(g) ‘\forall xi A(xi)’ \in SAT(s, I) ⇔ ‘A(xi)’ \in SAT(s’, I), for every sequence s’ which differs from the sequence s at most at the ith place;

(h) ‘\exists xi A(xi)’ \in SAT(s, I) ⇔ ‘A(xi)’ \in SAT(s’, I), for some sequence s’ which differs from the sequence s at most at the ith place.
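The clauses of the satisfaction definition translate almost line by line into a recursive procedure. The following sketch is an illustration under several assumptions that are not in the source: the domain D is finite, formulas are encoded as nested tuples, a sequence s is represented by a dictionary holding only the places that matter, and all names (sat, value, the operator tags, the sample interpretation) are hypothetical.

```python
def value(term, s, I):
    """Denotation of a term: the entry of s for a variable, I's choice for a constant."""
    kind, key = term
    return s[key] if kind == "var" else I["const"][key]


def sat(formula, s, I, D):
    """Return True when `formula` is satisfied by the sequence `s` under `I` over `D`."""
    op = formula[0]
    if op == "pred":                      # clause (a): atomic formulas
        _, name, terms = formula
        return tuple(value(t, s, I) for t in terms) in I["pred"][name]
    if op == "not":                       # clause (b)
        return not sat(formula[1], s, I, D)
    if op == "and":                       # clause (c)
        return sat(formula[1], s, I, D) and sat(formula[2], s, I, D)
    if op == "or":                        # clause (d)
        return sat(formula[1], s, I, D) or sat(formula[2], s, I, D)
    if op == "imp":                       # clause (e)
        return (not sat(formula[1], s, I, D)) or sat(formula[2], s, I, D)
    if op == "iff":                       # clause (f): both implications
        return (sat(("imp", formula[1], formula[2]), s, I, D)
                and sat(("imp", formula[2], formula[1]), s, I, D))
    if op in ("forall", "exists"):        # clauses (g) and (h)
        _, i, body = formula
        # Each {**s, i: d} is a sequence differing from s at most at the ith place.
        variants = (sat(body, {**s, i: d}, I, D) for d in D)
        return all(variants) if op == "forall" else any(variants)
    raise ValueError(f"unknown operator: {op}")


# Assumed toy interpretation over a three-element domain.
D = {"London", "Manchester", "Thames"}
I = {
    "const": {"london": "London"},
    "pred": {
        "City": {("London",), ("Manchester",)},
        "Larger": {("London", "Manchester")},
    },
}
larger = ("pred", "Larger", [("var", 1), ("var", 2)])
city_x1 = ("pred", "City", [("var", 1)])

print(sat(larger, {1: "London", 2: "Manchester"}, I, D))  # True
print(sat(larger, {1: "Manchester", 2: "London"}, I, D))  # False
print(sat(("exists", 1, city_x1), {}, I, D))              # True
print(sat(("forall", 1, city_x1), {}, I, D))              # False
```

Representing s by a dictionary rather than an infinite sequence reflects the fact that only finitely many places of a sequence are ever consulted for a given formula.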

The first clause establishes the satisfaction-conditions for atomic formulas that refer to relations (sets can be considered as one-place relations). Conditions (b)–(f) repeat the semantic definitions of propositional connectives; (g) and (h) concern quantifiers and say that an (open) universal formula is satisfied by every sequence, but an existential formula by some sequence (‘differs at most at the ith place’ is a technical phrase to capture the intended meaning). The reference to an interpretation I indicates its role in the correlation of expressions and their references, for instance predicates and relations. Since I is always associated with a model M, the expression ‘A’ \in SAT(s, I) can be replaced by the phrase ‘A’ \in SAT(s, M) (a formula A is satisfied by a sequence s in a model M). If s is an infinite sequence and A has n free variables, only n terms of s are relevant to A’s being satisfied or not. Another formal possibility to define the satisfaction relation consists in introducing sequences of a sufficient finite length.

What about sentences? Consider the example with London and Manchester. The formula (*) ‘x1 is a larger city than x2’ is satisfied by every ordered pair <s1, s2> such that s1 = I(x1) and s2 = I(x2) are cities, and s1 is larger than s2. In particular, the pair <London, Manchester> satisfies (*). Note that the sequence <s1, s2> can be enlarged by adding an arbitrary number of terms in order to have an infinite sequence <s1, s2, s3, …, sk, …>, but this operation is irrelevant to satisfaction or lack thereof. Informally speaking, if a sequence <s1, s2> satisfies (or does not satisfy) the formula (*), the same applies to the sequence <s1, s2, s3, …, sk, …>, because the terms s1, s2 are the only ones that are significant for the satisfaction in question. Now substitute ‘Chicago’ for x2. That gives (**) ‘x1 is a larger city than Chicago’. This formula is satisfied by every sequence <s1> such that s1 = I(x1) is a city and s1 is larger than Chicago, in particular by the sequence <London>. Enlarging the sequence <London> by adding an arbitrary number of terms does not change the situation. Every sequence of the form <London, s2, s3, …, sk, …> satisfies the formula (**).

Finally, consider (***) ‘London is a larger city than Manchester’, which is a sentence, not an open formula. Since it has no free variables, its satisfaction does not depend on valuations of free variables. Hence, every infinite sequence of the form <s1, s2, s3, …, sk, …> satisfies (***). In other words, we can replace sk by an arbitrary object and this step has no relevance for the satisfaction of (***). It is satisfied, because London is a larger city than Manchester.

Another way to the same result consists in using a theorem of first-order logic: ‘if A is a sentence, then ∀xiA ⇔ A’. Assume that a sequence s satisfies a sentence A, such as (***). Then s also satisfies ∀xiA, so, by clause (5g), A is satisfied by every sequence s’ which differs from s at most at the ith place. Since A has no free variables, the ith place can be arbitrarily chosen from the terms of s’. This means that every sequence satisfies A. This reasoning implies that if a sentence A is satisfied by at least one sequence, it is also satisfied by any other sequence. Conversely, if a sentence is not satisfied by at least one infinite sequence, it is also not satisfied by any other infinite sequence.

Accordingly, the following statements are obtained:

(6) A sentence is satisfied by all sequences if and only if it is satisfied by at least one sequence.

(7) A sentence is not satisfied by at least one sequence if and only if it is satisfied by no sequence.

Both assertions lead to

(8) If A is a sentence, it is either satisfied by all sequences or satisfied by no sequence.

(6) and (7) lead to the following definition:

(SDT) (a) ‘A’ is true in M if and only if ‘A’ is satisfied by every infinite sequence of objects from M (equivalently: by at least one such sequence);

(b) ‘A’ is false in M if and only if ‘A’ is not satisfied by some infinite sequence of objects from M (equivalently: by no such sequence).

However, we can also prove that if a sentence is satisfied by every infinite sequence of objects (or by one such sequence), it is also satisfied by the empty sequence of objects. Thus, SDT can also be formulated by saying that the sentence A is true if and only if it is satisfied by the empty sequence of objects (the notion of the empty sequence is a generalization of the usual definition of sequence). This definition is model-theoretic and explicitly appeared in (Tarski, Vaught 1957). Tarski’s original treatment assumed that satisfaction and truth refer to the one domain in which expressions are interpreted. One can perhaps say that the concept of model was implicitly involved in Tarski 1933.
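Statements (6)–(8) and the empty-sequence formulation of SDT can be checked on the city example. The sketch below is illustrative only: the RANK table is an assumed ordering, not population data, finite tuples stand in for infinite sequences, and all function names are hypothetical.

```python
from itertools import product

# Assumed ordering of three cities, used only to make the example runnable.
RANK = {"London": 3, "Chicago": 2, "Manchester": 1}
DOMAIN = tuple(RANK)


def larger(a, b):
    return RANK[a] > RANK[b]


def open_formula(s):  # (*)   x1 is a larger city than x2
    return larger(s[0], s[1])


def half_open(s):  # (**)  x1 is a larger city than Chicago
    return larger(s[0], "Chicago")


def sentence(s):  # (***) London is a larger city than Manchester
    return larger("London", "Manchester")  # consults no term of s


def sequences(length):
    """All sequences of the given finite length over the domain."""
    return list(product(DOMAIN, repeat=length))


def satisfied_by_all(formula, length):
    return all(formula(s) for s in sequences(length))


def satisfied_by_some(formula, length):
    return any(formula(s) for s in sequences(length))


def true_in_model(closed_formula):
    """SDT restated: a sentence is true iff the empty sequence satisfies it."""
    return closed_formula(())


# An open formula can be satisfied by some sequences and not by others.
print(satisfied_by_some(open_formula, 2), satisfied_by_all(open_formula, 2))
# For a sentence, 'some', 'all' and 'the empty sequence' always agree.
print(satisfied_by_some(sentence, 2), satisfied_by_all(sentence, 2),
      true_in_model(sentence))
```

The contrast between the two printed lines is the content of (8): only formulas with free variables can separate sequences.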

Let us look at the consequences of SDT in the above formulation. Since it assumes resources to meet (LP) and similar paradoxes, its consistency against semantic antinomies is guaranteed. Since SDT does not use the concept of truth, it is not circular. On the other hand, we must suppose that our metatheory (weak second-order arithmetic) is correct in an intuitive sense. According to Tarski, SDT is formulated in the morphology (syntax) of ML. Due to the understanding of logic around 1930, it covered set theory or the theory of logical types. Thus, Tarski was justified in his view that the correctness of metatheory is reduced to that of pure logic.

Today, the situation is more complicated. One can say that SDT proceeds as a typical mathematical construction based on a portion of set theory. Although some philosophers, for instance Husserl and his followers, will probably be dissatisfied by this situation vis-a-vis their claim that philosophical constructions have to be free of presuppositions, the defenders of SDT (and similar constructions) can reply that (a) conformity to mathematical practice is more important than a priori metaphilosophical postulates, and that (b) an informal understanding of ML is inevitable for logical constructions pertaining to L. Since ML exceeds L in expressive means, we also have a good articulation of the claim that ML must be richer than L in order for truth for the latter to be defined in the former. SDT satisfies CT and implies (BI).

The set Tr(L) has various metamathematical properties. It is consistent and forms a deductive system, which is maximal (no sentence can be added without losing consistency), compact (Tr(L) is consistent if and only if every finite subset of it is consistent) and syntactically complete (for any A, A ∈ Tr(L) or ¬A ∈ Tr(L)). On the other hand, sets of truths are not always finitely axiomatizable. In other words, it is not the case that for any Tr(L), there exists a finite set X ⊂ Tr(L) such that Cn(X) = Tr(L) (the symbol Cn refers to the consequence operation). SDT leads to a very elegant account of logical consequence (see Tarski 1936a). We say that the sentence A belongs to the set of consequences of the set X if and only if every model of X is also a model of A. In symbols, A ∈ Cn(X) if and only if for every M, if M is a model of X (every sentence from X is true in M), then A is true in M.
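The model-theoretic account of consequence can be illustrated in a deliberately simplified setting. The sketch below assumes a propositional language with three sentence letters, so that a "model" is just a valuation and all models can be listed; the names ATOMS, all_models, is_model_of and in_cn are hypothetical, and the first-order case differs in that its models cannot be enumerated this way.

```python
from itertools import product

ATOMS = ("p", "q", "r")  # hypothetical sentence letters


def all_models():
    """Yield every valuation of ATOMS; each one plays the role of a model M."""
    for values in product((False, True), repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def is_model_of(model, sentences):
    """M is a model of X when every sentence from X is true in M."""
    return all(sentence(model) for sentence in sentences)


def in_cn(a, x):
    """A is in Cn(X) iff every model of X is also a model of A."""
    return all(a(model) for model in all_models() if is_model_of(model, x))


# Sentences are represented as functions from a model to a truth-value.
def p(m):
    return m["p"]


def q(m):
    return m["q"]


def r(m):
    return m["r"]


def p_implies_q(m):
    return (not m["p"]) or m["q"]


print(in_cn(q, [p, p_implies_q]))  # q follows from p and p => q
print(in_cn(r, [p, p_implies_q]))  # r does not follow from them
```

Note that an inconsistent X has no models, so on this definition every sentence counts as its consequence.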

STT, claiming that ‘is true in L’ is defined in ML, raises the question whether we can define truth inside L. The Tarski Undefinability Theorem (TUT) says that if a consistent theory T contains the arithmetic of natural numbers, the set of T-truths is not definable in T. In other words, the truth-predicate is not definable in languages sufficiently rich for expressing the arithmetic of natural numbers. So, TUT is a limitative theorem. Gödel’s first incompleteness theorem (GFT) is perhaps the most famous example of a limitative theorem. It states that if AR (the formal arithmetic of natural numbers) is consistent, it is also incomplete, that is, there is an arithmetical sentence A such that neither A nor ¬A is provable in AR.

The informal proof of GFT proceeds in the following way. Consider the sentence (i) ‘the sentence (i) is not provable’. If (i) is true, it is unprovable; if it is false, it is unprovable as well, because logic cannot lead from true axioms to false consequences (we tacitly assume that the axioms of AR are true). In the second case, however, (i) would say something true, so it cannot be false. Using the law of excluded middle, we obtain that there exists a true but unprovable sentence.

The above reasoning is semantic. The formal proof of GFT is purely syntactic and uses arithmetization, that is, translation of metamathematical concepts and theorems into the language of AR.

Assume that STTL is a correct (consistent) truth-theory for L formulated in this language and that a formula A ∈ L mentions itself and says ‘A does not define truth’. If A ∈ Tr(L), truth is undefinable by A. Now, A is not a theorem of STTL, that is, ¬(STTL ├ A) (or A ∉ Cn(STTL)). This assertion is justified by a reductio argument. Assume that STTL ├ A. Hence, ¬A ∉ Cn(STTL). Hence, ¬A can be either false or independent of STTL. The first case is impossible, because it would mean that STTL defines truth for L. Thus, the second possibility remains, namely that STTL does not define truth for L. Assume that A is false. This means that STTL defines truth for L. However, this is impossible, because A would be a false theorem of STTL, but we assumed that this theory is materially correct and so contains no falsehoods. Thus, we have proved that STTL does not define the truth-predicate for L (the informal version of Tarski’s undefinability theorem (TUT)). A more technical version of this theorem says that there is no formula Tr(x) ∈ LAR such that for any sentence A ∈ LAR, AR ├ A ⇔ Tr(‘A’). The proof of TUT in this formulation uses the fixed-point lemma (FPL), which says that if A(x) ∈ LAR and A(x) has one free variable, then ∃B ∈ LAR (AR ├ B ⇔ A(‘B’)). The proof is remarkably brief. Assume that there is a formula Tr(x) of the kind excluded by the technical version of TUT. By FPL, there exists a sentence A such that AR ├ A ⇔ ¬Tr(‘A’). By our assumption, we obtain AR ├ Tr(‘A’) ⇔ ¬Tr(‘A’), but this conflicts with the consistency of AR.
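The brief formal argument can be laid out step by step. The following LaTeX block is one way of writing it, under the assumptions of the text; the letter G for the fixed-point sentence and the corner quotes for names of sentences are notational choices made here, not taken from the original.

```latex
% Requires amsmath and amssymb.
% G is a hypothetical name for the sentence supplied by the fixed-point lemma.
\begin{align*}
\text{(1)}\quad & \mathrm{AR} \vdash A \leftrightarrow \mathrm{Tr}(\ulcorner A \urcorner)
  \ \text{ for every sentence } A \in L_{\mathrm{AR}}
  && \text{assumption, for reductio} \\
\text{(2)}\quad & \mathrm{AR} \vdash G \leftrightarrow \neg\mathrm{Tr}(\ulcorner G \urcorner)
  && \text{FPL applied to } \neg\mathrm{Tr}(x) \\
\text{(3)}\quad & \mathrm{AR} \vdash G \leftrightarrow \mathrm{Tr}(\ulcorner G \urcorner)
  && \text{instance of (1) with } A := G \\
\text{(4)}\quad & \mathrm{AR} \vdash \mathrm{Tr}(\ulcorner G \urcorner) \leftrightarrow \neg\mathrm{Tr}(\ulcorner G \urcorner)
  && \text{from (2) and (3)}
\end{align*}
```

Line (4) is provably equivalent to a contradiction, so a consistent AR cannot contain a formula satisfying (1).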

Formulations and proofs of GFT and TUT essentially appeal to self-referentiality. However, the former theorem does not demonstrate that the sentence ‘I am not provable’ is paradoxical, but only that it is independent of AR. The situation in the context of TUT is radically different. In particular, the second part of the informal proof of this theorem shows that adding the formula A (in the indicated meaning) results in a contradiction. The formal proof of TUT via FPL confirms this assertion. In fact, FPL can be considered as a metalogical (metamathematical) diagnosis of what is wrong with the Liar Paradox. This outcome is important because it shows that paradoxes related to self-reference are not curiosities but have deep connections with general mathematical results. Finally, one should see a fundamental difference between GFT and TUT. Although both have similar informal formulations appealing to the concept of truth, the former can be replaced by its syntactic version, whereas the latter cannot. In the language of recursion theory, the set of provable sentences of AR is not recursive (a set is recursive if and only if it is computable; this implies that the complement of a recursive set is recursive as well), but it is recursively enumerable (a set is recursively enumerable provided that it can be enumerated by natural numbers; this does not imply that its complement can be enumerated as well), whereas the set of arithmetical truths does not fulfil the condition of recursive enumerability. Thus, semantics cannot be reduced to syntax. This fact is particularly important in metamathematics, because doing formal semantics for theories sufficient for expressing AR requires infinitistic methods, but the syntax of such systems is finitary.

5. Philosophical Comments

Tarski explicitly asserted that he considered STT as an answer to one of the central problems of epistemology. This claim motivates several philosophical comments about the truth-theory. However, we enter here risky territory, because philosophy is full of conflicts and polemics. Limiting attention to analytic philosophy, STT has (or has had) radical critics such as Otto Neurath and Hilary Putnam, radical defenders such as Rudolf Carnap and Karl Popper, sceptics maintaining that it is philosophically sterile, and an army of more or less faithful followers trying to improve or reinterpret it, such as Donald Davidson, Hartry Field, Paul Horwich and Saul Kripke. At least three important contemporary philosophers radically changed their views under Tarski’s influence, namely Kazimierz Ajdukiewicz (who rejected radical conventionalism), Carnap (who changed his early view that logical syntax is the core of philosophy and defended semantics as the foundation of philosophical analysis) and Popper (who adopted scientific realism as the most appropriate philosophy of science).

The above brief survey focused on positive as well as negative influences of Tarski’s ideas. Both indicate that STT is a contemporary philosophical tool, at least in the camp of analytic philosophy. (Continental philosophy is ignored here, although a longer treatment should also refer to this tradition.)

Without pretence to completeness, here are the problems which should be addressed by any philosophically reasonable truth-theory (being philosophically reasonable does not mean being correct, but rather deserving attention in the world of philosophy).

  1. What are the bearers of truth?
  2. What are the initial intuitions associated with a given truth-definition?
  3. How to define truth, and what about the consequences of SDT?
  4. Is the division of truth-bearers stable, that is, do at least some truth-bearers sometimes change their truth-values (briefly: is truth relative or absolute)?
  5. What is a truth-criterion and what is the relationship between truth-criteria and truth-definition?
  6. What is the relation of a particular truth-theory to its rivals?
  7. How can a given truth-theory be defended against various objections?
  8. What is the relation of truth to other philosophical problems?

So, there is much for a theory of truth to accomplish. This article tries to show how STT is related to these questions, or at least to some of them.

(1) STT assumes that truth-bearers are sentences in the syntactic sense. Yet there are several more concrete possibilities. Sentences? Propositions? Statements? Judgments? These entities can be either linguistic units or objects expressed by linguistic utterances. By contrast, concepts are not truth-bearers, contrary to what Hegelians say. To have a convenient label, we can say that, according to STT, entities qualified as true or false are of the propositional syntactic category. This way of speaking has nothing to do with the question of the ontological nature of propositions, for instance, as abstract objects. Tarski himself chose meaningful sentences as entities on which truth is predicated.

(2) Tarski always stressed that his definition follows the intuitions of Aristotle. Tarski was influenced by the Stagirite himself as well as by his Polish teachers, particularly Tadeusz Kotarbiński. Tarski, like most Polish philosophers, uses the label ‘classical truth definition’ as referring to Aristotelian ideas. At the beginning, Tarski identified the classical and the correspondence theory of truth, but later he expressed greater reservations with respect to explanations via expressions such as “agreement” or “correspondence” than to Aristotle’s original formulation. It is not controversial that a T-equivalence says of a true sentence that it states how things are.

What about SDT? We have two options. The first, which has some justification in Tarski’s explanations, is that satisfaction by all sequences of objects is a mathematical trick. The second is that the official definition corresponds to some ordinary intuitions. The second option is based on some facts, for instance, that SDT entails T-sentences and BI. In any case, SDT suggests that truth depends on the domain (model) and how it is. This definition does not appeal to terms such as ‘agreement’ (of a truth-bearer and the world, fact, state of affairs, and so forth), ‘picturing of the world by minds, thought, and so forth’, ‘structural similarity’, and so forth. One can propose to distinguish the strong correspondence theory, as in the famous formulation veritas est adequatio rei et intellectus, and the weak (semantic) correspondence. Presumably STT might be interpreted as a weak correspondence theory.

(3) Tarski decided to define truth by a single formula (the definition of satisfaction is recursive). He considered introducing truth by axioms, but he rejected this possibility for philosophical reasons. More specifically, he was afraid of being criticized by philosophers from the Vienna Circle for advocating physicalism (see Tarski 1936). This motivation is now of purely historical interest. Today, the axiomatization of the concept of truth is commonly applied.

TUT has some intriguing consequences for philosophy. Assume what is natural and philosophically tempting, namely that the collection TRUTH of all truths is infinite. By TUT, TRUTH is not definable by resources conceptually available within it. The only admissible way out within set theory consists in considering TRUTH to be too big a set (Zermelo-Fraenkel system), a class as distinct from sets (Bernays-Gödel-von Neumann) or a category. All these outcomes are formally correct, but they lead to somewhat unwelcome consequences, at least for philosophers who like to say something about the set of all truths. Set theory and TUT seriously limit such theoretical ambitions. On the other hand, this fact gives a precise meaning to the assertion that truth is transcendental in the sense of the medieval theory of transcendentalia (verum omnia genera transcendit).

(4) The classical concept of truth is commonly considered as absolute, that is, if A is true then it is true eternally (for ever) and sempiternally (since ever). On the other hand, SDT indexes truth by L and M. Does this deprive truth of its absolute character? This question is connected with such issues as bivalence, logical determinism and many-valued logic. Without entering into details concerning this fairly complex stock of ideas, it might be suggested that one can model-theoretically prove that truth is eternal if and only if it is sempiternal. Thus, the classical theory of truth in the semantic setting can be considered as associated with the absolute concept of truth. Even if this conclusion encounters reservations, the possibility of analysing the absolutism/relativism controversy within the philosophical theory of truth via SDT is a remarkable fact.

(5) Clearly, SDT is a-criterial. This means that the definition in question does not generate any truth-criterion, although it says what truth is. If mathematics is taken into account, proof can be regarded as a measure of truth. However, there arises a problem. Let the symbol Pr denote the provability operator. By the Löb theorem, we have PrA ⇒ A, a theorem very similar to TrA ⇒ A. But, due to the first incompleteness theorem, the formula A ⇒ PrA cannot be consistently added to the provability logic. Hence, there is no counterpart of the T-scheme with Pr instead of Tr, that is, the scheme PrA ⇔ A. So, we must conclude that proof is not a complete truth-criterion even in mathematics. This fact can motivate various ways out, for instance, modifying the concept of proof (every true mathematical assertion can be proved in a formal system; this assertion does not contradict the incompleteness theorem) or replacing truth by proof, possibly with additional constraints, for instance, that proofs must be constructive. However, such proposals are restricted to mathematics. Another suggestion is that truth-criteria consist of procedures which justify satisfaction of open formulas by some objects.

(6) Tarski grew up in the tradition of dividing truth-theories into the classical theory and so-called non-classical theories, namely the evidence theory (A is true if A is evident), the coherence theory (A is true if it can be embedded in a coherent system without destroying its coherence), the common agreement theory (A is true if specialists agree about its correctness) and the utilitarian theory (A is true if A is useful). The non-classical theories are criterial, because they appeal to procedures assuring that something is true. Tarski himself mentioned the last definition and the coherence account. In general, he considered non-classical theories as lacking precision and he did not discuss them as serious alternatives to STT.

Another issue involving the relation between various truth-theories concerns substantial and minimalist accounts. The latter approach (the redundancy theory, the deflationary theory, and so forth) reduces the truth-definition to the T-scheme. Under this view, STT is a minimalist theory. Tarski himself discussed this question. His counterexample was the sentence ‘All consequences of true sentences are true.’ The T-scheme alone does not justify this sentence, that is, it does not license the assertion that all consequences of true sentences are true. There are much more complicated examples, for instance, the sentence ‘There exist true but not provable sentences’, which does not seem amenable to a minimalist translation. If so, STT is essentially richer than any minimalist theory of truth.

(7) Consider three objections stated by Franz Brentano against the classical theory, and an attempt to show that STT meets them successfully. First, the concept of correspondence is obscure and cannot be satisfactorily explained. More precisely, in order to establish what a truth-bearer corresponds to in reality, one must compare the former with the latter. But this is impossible, due to the nature of the relata of such a comparison. However, this objection applies to the strong notion of correspondence, not to its weak form. The second objection is more serious. Assume that we define truth by a definition D. Yet D is a sentence. In order to be a good definition, D must be true. Now, the definition is either circular (if it uses itself) or falls into a regressus ad infinitum, because in order to formulate D, we must appeal to D’ related to D, and so forth. Third, the concept of correspondence does not explain the truth of negative sentences. The answers to these objections depend on the relation of L to ML. These relations do not entail that SDT is circular or leads to an infinite regress. The problem of negative sentences has a simple solution in STT because they are true (or false) under the same definition as positive ones.

(8) Tarski underlined that one can accept STT without being committed to strong ontological or epistemological views such as idealism or realism. In other words, STT is independent of such philosophical assumptions or consequences. Independently of Tarski’s intentions, it is easy to give an example of a philosophical problem closely related to STT, namely the semantic realism / semantic anti-realism debate. Generally speaking, (semantic) realists, such as Donald Davidson, use STT; but (semantic) anti-realists (such as Michael Dummett) reject this account of truth. This controversy concerns the mutual relation of the condition of truth and condition of assertibility. The realist says that the meaning of a sentence (MS) is given by its truth-conditions (TC), but the anti-realist says the meaning is given by assertibility-conditions (AC). Thus, we have two equalities:

(i) MS = TC (realism);

(ii) MS = AC (anti-realism).

However, (i) and (ii) are still too vague. In fact, (i) and (ii) should be transformed into

(iii) MS = TC \wedge TC ⇒ AC;

(iv) MS = AC \wedge TC = AC.

The realist says that truth-conditions exceed assertibility-conditions, but the anti-realist identifies truth-conditions with assertibility-conditions. How does STT work here? It justifies (iii), but it refutes (iv). If, as many anti-realists claim, the conditions of assertibility are governed by intuitionistic logic, it does not generate sufficient and necessary conditions for asserting any mathematical sentence. The point is that the incompleteness theorem constructively holds for Heyting arithmetic (Peano arithmetic based on intuitionistic logic). If so, the anti-realist cannot say that there are true but unprovable sentences, whereas the realist can by appealing to STT. As far as the issue concerns more general (that is, ontological and/or epistemological) forms of realism and anti-realism, some insights are provided by results concerning whether semantics is fully expressible in syntax. The general philosophical problem considers the relation between the knowing subject and the object of knowledge. Following a modernized version of Ajdukiewicz’s proposal, the former is represented by syntax, which defines the subject inside language, but the latter can be identified with a model of this language. Since, due to TUT, models transcend languages or cannot be defined within them, the realists’ view on knowledge and reality has some justification.

6. Final Remarks

STT employs logical tools throughout. Yet this theory is not a logical calculus in the sense in which propositional or predicate logic are. STT is metamathematical, and possibly axiomatic, if this approach is chosen. The status of T-equivalences provides a good illustration in this respect. They are neither logical tautologies nor material biconditionals. As consequences of SDT they have the status of mathematical theorems provable from axioms. This remark does not end the discussion about the character of T-equivalences, but at least it outlines the direction which seems correct. In any case, STT belongs to logic in a broad sense.

The philosophical content of STT plays an important role in philosophy of language, logic and mathematics, at least in clarifying some issues. On the other hand, the belief that STT can ultimately solve various problems of these parts of philosophy would be exaggerated. This applies even more to epistemology and ontology. Still, as this article documents, although philosophical uses of the semantic theory of truth are problematic, Tarski’s semantic ideas are not philosophically sterile.

7. References and Further Reading

The readings below include only general books on Tarski and his basic writings. Further bibliographical references are available in the books mentioned.

  • Beeh, V., 2003, Die halbe Wahrheit. Tarskis Definition & Tarski’s Theorem, Paderborn, Mentis.
  • Butler, M. K., 2017, Deflationism and Semantic Theories of Truth, Manchester, Pendlebury Press.
  • Casari, E., 2006, La matematica della verità. Strumenti matematici della semantica logica, Torino, Bollati.
  • Cieśliński, C., 2017, The Epistemic Lightness of Truth. Deflationism and its Logic, Cambridge, Cambridge University Press.
  • David, M., 1994, Correspondence and Disquotation. An Essay on the Nature of Truth, Oxford, Oxford University Press.
  • De Florio, C., 2013, La forma della verità. Logica e filosofia nell’opera di Alfred Tarski, Milano, Mimesis.
  • Glanzberg, M., 2018, ed., The Oxford Handbook of Truth, Oxford, Oxford University Press.
  • Gruber, M., 2016, Alfred Tarski and the “Concept of Truth in Formalized Languages”. A Running Commentary with Consideration of the Polish Original and the German Translation, Dordrecht, Springer.
  • Halbach, V., 2011, Axiomatic Truth Theories, Cambridge, Cambridge University Press.
  • Horsten, L., 2011, The Tarskian Turn. Deflationism and Axiomatic Truth, Cambridge, Mass., The MIT Press.
  • Kirkham, R. L., 1992, Theories of Truth. A Critical Introduction, Cambridge, Mass., The MIT Press.
  • Künne, W., 2005, Conceptions of Truth, Oxford, Oxford University Press.
  • Martin, R. L., 1984, ed., Recent Essays on Truth and the Liar Paradox, Oxford, Clarendon Press.
  • Moreno, L. F., 1992, Wahrheit und Korrespondenz bei Tarski. Eine Untersuchung der Wahrheitstheorie Tarskis als Korrespondenztheorie der Wahrheit, Würzburg, Königshausen & Neumann.
  • Pantsar, M., 2009, Truth, Proof and Gödelian Arguments. A Defence of Tarskian Truth in Mathematics, Helsinki, University of Helsinki.
  • Patterson, D., 2012, Alfred Tarski: Philosophy of Language and Logic, Hampshire, Palgrave Macmillan.
  • Patterson, D., 2008, ed., New Essays on Tarski and Philosophy, Oxford, Oxford University Press.
  • Puntel, L. B., 1990, Grundlagen einer Theorie der Wahrheit, Berlin, de Gruyter.
  • Rojszczak, A., 2005, From the Act of Judging to the Sentence. The Problem of Truth Bearers from Bolzano to Tarski, Dordrecht, Springer.
  • Simons, P., 1992, Philosophy and Logic in Central Europe from Bolzano to Tarski. Selected Essays, The Hague, M. Nijhoff.
  • Stegmüller, W., 1957, Das Wahrheitsproblem und die Idee der Semantik, Wien, Springer.
  • Tarski, A., 1933, Pojęcie prawdy w językach nauk dedukcyjnych, Warszawa, Towarzystwo Naukowe Warszawskie; Germ. tr. (with additions), Tarski 1935; Eng. tr., Tarski 1956a.
  • Tarski, A., 1935, Der Wahrheitsbegriff in den formalisierten Sprachen, Studia Philosophica 1, 53–198 [Germ. tr. of Tarski 1933].
  • Tarski, A., 1936, Grundlegung der wissenschaftlichen Semantik. In Actes du Congrès international de philosophie scientifique, Paris 1935, fasc. 3: Sémantique, Paris, Hermann, 1–14; Eng. tr. in Tarski 1956, 401–408.
  • Tarski, A., 1936a, Über den Begriff der logischen Folgerung. In Actes du Congrès international de philosophie scientifique, Paris 1935, fasc. 7: Logique, Paris, Hermann, 1–11; Eng. tr. in Tarski 1956, 409–420.
  • Tarski, A., 1944, The Semantic Conception of Truth and the Foundations of Semantics. Philosophy and Phenomenological Research 4, 341–395; reprinted in Tarski 1986, v. 2, 665–699.
  • Tarski, A., 1956, Logic, Semantics, Metamathematics. Papers from 1923 to 1938, Oxford, Clarendon Press; 2nd ed., Indianapolis, Hackett Publishing Company.
  • Tarski, A., 1956a, The Concept of Truth in Formalized Languages. In Tarski 1956, 152–278 [Eng. tr. of Tarski 1935].
  • Tarski, A., 1969, Truth and Proof. L’âge de la science 1, 279–301; reprinted in Tarski 1986, v. 4, 399–422.
  • Tarski, A., 1986, Collected Papers, v. 1–4, Basel, Birkhäuser.
  • Tarski, A., Vaught, R., 1957, Arithmetical Extensions of Relational Systems. Compositio Mathematica, 13, 81–102; reprinted in Tarski 1986, v. 4, 651–682.
  • Woleński, J., Köhler, E., 1999, eds., Alfred Tarski and the Vienna Circle. Austro-Polish Connections in Logical Empiricism, Dordrecht, Kluwer.

Author Information

Jan Woleński
Email: jan.wolenski@uj.edu.pl
University of Information Technology and Management
Poland

Paradigm Case Arguments

From time to time philosophers and scientists have made sensational, provocative claims that certain things which, in everyday life, we unquestioningly take for granted as existing or happening do not exist or never happen. These claims have included denying the existence of matter, space, time, the self, free will, and other sturdy and basic elements of our common-sense or naïve world-view. Around the middle of the twentieth century an argument based on linguistic considerations was developed that can be used to challenge many such skeptical claims. It came to be known as the Paradigm Case Argument (henceforth, the PCA).

Consider, for instance, the following argument from a skeptic who denies that there are cases of seeing people. First, it cannot be said that we see the people who walk our streets, since they are mostly covered with clothes. All that we see, strictly speaking, are their faces and hands. But to see any such people stripped naked would be little better, since we then would be seeing only their facing surfaces while only imagining or anticipating, not seeing, their rear sides. With well-placed mirrors we might be able to see all their sides at once, but we are still seeing only their exterior, which does not constitute the whole person. No, to see these people proper we would need to have them opened up, with all their interior parts displayed for us too. But then we would no longer have a person, but a corpse or a display of people-parts. Hence there are no cases of seeing people.

A philosopher using the PCA could then counter this by pointing out that it is in fact a perfectly natural and proper use of the word ‘see’ to say that you see a person in ordinary cases where you are looking at a fully intact person with his or her clothes on. She might then, if necessary, describe situations where we do or would say this. She might point out that we teach or train children and also adults who are learning English how to use the expression ‘see a person’ with reference to everyday cases when one sees them clothed. (Teacher: ‘What do you see on page seven?’ Learner: ‘A person.’ Teacher: ‘That’s correct.’)  These are paradigm cases of seeing people, exemplars that we use when teaching and explaining the meaning of that expression. That being so, there is no logical room for a philosophical argument showing that these are not cases of seeing people. Trying to argue that they are not would be like trying to argue that the paintings of Picasso that the term ‘cubism’ was coined to denote are not cubist (because they do not depict geometrically exact cubes, say).

This article shows the PCA being applied to the more controversial topic of free will skepticism, examines its logical structure, and looks at some common objections to it. The appraisal of the PCA leads to issues of some depth and importance.

Table of Contents

  1. History and Significance of the Argument
  2. Paradigm Cases
  3. The PCA as Part of a Wider Response to the Skeptic
  4. Malcolm’s Version of the PCA
  5. Flew’s Version of the PCA
  6. Critical Responses to Flew’s PCA
    a. Challenging the First Premise
    b. Challenging the Second Premise
    c. The Charge of Irrelevance
  7. “Ordinary Language is Correct Language”
  8. Ordinary Usage as Practices
  9. Conclusion
  10. References and Further Reading

1. History and Significance of the Argument

The PCA is closely associated with the linguistic philosophy movement that peaked in the mid-twentieth century, when many philosophers were urging that philosophical questions and problems should be approached by paying careful attention to the language that we use for expressing them. More specifically, it was associated with the ordinary language philosophy approach within that broader movement, where the emphasis was on examining the ordinary use of terms. Both advocates and critics of the PCA have claimed that it is foundational to those philosophical outlooks and key to understanding them (for example, Flew 1966, p. 261; Gellner 1959, pp. 30–32; Parker-Ryan 2010, p. 123).

The first explicit presentation of the PCA was in a classic paper of the ordinary language philosophy tradition by Norman Malcolm, originally published in 1942, called ‘Moore and Ordinary Language’ (also see Malcolm 1963). Malcolm studied under and was influenced by G. E. Moore and Ludwig Wittgenstein at Cambridge. He then returned to the USA and became a leading exponent of Wittgenstein’s philosophy there. He believed that the PCA was inchoate in Moore’s famous ‘proof’ (1939) of an external world, and he also stated (1963, p. 183) that grasping it was essential for understanding some of Wittgenstein’s most distinctive remarks on the nature of philosophy, such as, ‘Philosophy must not interfere in any way with the actual use of language, so it can in the end only describe it. For it cannot justify it either. It leaves everything as it is’ (Wittgenstein 2009/1953, §124). Antony Flew was another prominent early exponent of the PCA, who applied and defended it in a series of articles beginning in the 1950s.

The argument was employed by Malcolm, Flew, and others to defend the existence of a variety of things from skeptical attack, such as cases of acting freely (Black 1958; Danto 1959; Flew 1954 & 1955a; Hanfling 1990; Hardie 1957), causation (Black 1958), solidity (Stebbing 1937; Urmson 1953), space and time (Malcolm 1992/1942), material things and perceptions of material things (Malcolm 1992/1942; 1963), and certain knowledge of empirical propositions (Malcolm 1992/1942). For convenience, in what follows people who argue against the existence of such things are called ‘skeptics’, and people who use the PCA to counter such arguments are called ‘defenders’.

2. Paradigm Cases

The PCA exploits the idea of a paradigm case. Minimally, a paradigm case of something is a case that is supposed to come within the denotation or extension of the relevant word. But what is more, it is supposed to come centrally within its denotation; it is supposed to be a model example or exemplar, something about which we are inclined to say, ‘That’s an X if anything is’ or ‘If that’s not an X, I don’t know what is’. It is the kind of case that psychologists who study concepts would call a ‘prototypical category member’. Such cases have been found to be associated with various psychological phenomena, such as tending to spring to mind first when people are told to think of examples of an X, or being categorized as an X more rapidly than other category members in categorization tasks. This exemplar status makes a paradigm case especially fit for the purpose of explaining the meaning of the relevant word in ostensive definitions (and its being used for that purpose reinforces its exemplar status in turn).

A particularly striking example of a paradigm case in this sense (an exemplar of an exemplar, if you will) might be the International Prototype of the Kilogram, a cylinder of platinum-iridium alloy kept near Paris that was used until 2019 to define what a kilogram is, such that anything else was a kilogram in weight if and only if it was the same weight as this object. The cases that the defender refers to as paradigm Xs are thought of as playing a similar meaning-setting role in relation to the relevant term ‘X’ (though this comparison has its limits; for example, the cases might not have come to play that role through explicit stipulation or formal decision). The problem, then, that the defender has with the skeptic is that in denying that there are any Xs, the skeptic seems to be denying that what apparently are paradigm cases of Xs are Xs, which would be analogous to denying that the International Prototype of the Kilogram is a kilogram in weight.

3. The PCA as Part of a Wider Response to the Skeptic

Of course, when the skeptic denies that there are any Xs, he does so for reasons or on the basis of arguments. The PCA, however, does not directly engage with the arguments that the skeptic gives or the significant complexities they can give rise to. This is because, from the defender’s perspective, the skeptic’s claims can ‘be seen to be false in advance of an examination of the arguments adduced in support of them’ (Malcolm 1963, p. 181; also see Malcolm 1992/1942, p. 114), since the PCA is supposed to show that the skeptical claim must be wrong. In other words, for the defender, the skeptical argument (assuming it is logically valid) should be regarded as a reductio ad absurdum of a premise in the argument, since it leads to an absurd or impossible conclusion.

It is this apparently brusque way of treating the skeptic’s arguments that provoked suspicion and even hostility towards the PCA on the part of some critics. Thus some have sarcastically referred to it as a ‘remarkably economical device for resolving complex philosophical disputes’ (Beattie 1981, p. 78), or as ‘a very simple way of disposing of immense quantities of metaphysical and other argument, without the smallest trouble or exertion’ (Heath 1952, p. 1). For others it seems to take the fascination and wonder out of philosophy by its summary rejection of intriguing claims and arguments (Watkins 1957a, p. 26). Why the defender feels entitled to treat the skeptic’s arguments in this way is explained in section eight.

Defenders do not give the skeptic’s arguments quite the short shrift that these remarks suggest, however, since they see the PCA as being only a part of an adequate philosophical response to the skeptic. Accordingly, both Malcolm and Flew stated that to truly free us from the skeptic’s position, reminding us of ordinary linguistic usage is not enough. We also need to reconstruct and examine the reasoning (Malcolm 1951, p. 340; 1992/1942, p. 123) or to identify the ‘intellectual sources’ (Flew 1966, p. 264) that drew us towards the skeptical conclusion. (The importance of this is especially evident in the free will debate, where even philosophers who sympathize with the PCA defense of free will can still feel troubled by the skeptical arguments.) This part of the response to skepticism involves examining the skeptical arguments, and it can also involve unearthing any unstated presuppositions, comparisons, or pictures that might be informing those arguments. Sometimes these sources get their intellectual power over us precisely from the fact that we are not explicitly conscious of them, and they can lose this power when we become conscious of them (Wittgensteinians sometimes call this the ‘therapeutic’ part of the investigation). For instance, consider the argument that we never see people, a sort of argument that is not unprecedented (see Campbell 1944–45, pp. 14–18; Descartes 2008/1641, p. 23). Here the implicit assumption might be that in order to truly see something you must see all its parts or aspects, or the implicit comparison might be with cases of seeing a movie or a play, which one has not properly done unless one has seen it from beginning to end (if we miss a bit, we qualify our statement: ‘I saw most of it’). In sum, defenders believe that ‘the application of a PCA is only a begin-all and not a be-all and end-all of the satisfactory treatment’ of the skeptic’s challenge (Flew 1982, p. 117; 1966, pp. 264–265).

It is also recognized by some defenders that identifying the paradigm cases of something is a far cry from giving an account or theory of it. If something is a paradigm case of an X it is so because of certain features that it has and does not have, and philosophers often want to know what these features are, though they cannot simply be ‘read off’ some paradigm cases. Identifying paradigm cases can then be only a ‘jumping-off point for establishing the relevant rules and conventions’ (Black 1973, p. 271) governing the term, and a preliminary to developing an alternative account of the phenomenon to the one implicit in the skeptic’s argument.

4. Malcolm’s Version of the PCA

A close reading of the literature on the PCA reveals that there are not one but two different kinds of argument that go by the name ‘paradigm case argument’. The first, which is especially evident in Malcolm’s 1942 paper, is of more limited application. Distinguishing between these versions is important, as failing to do so can lead to confusion in the critical appraisal of these sorts of arguments.

The key feature of what we may call ‘Malcolm’s version’ is that it exploits the idea that there are certain expressions ‘the meanings of which must be shown and cannot be explained’ (Malcolm 1992/1942, p. 120). Color terms are often mentioned to illustrate this; to make someone fully understand what ‘yellow’ means you must go beyond verbal explanations and produce a sample. Consider, for instance, a philosopher who claims that space and time do not exist. Malcolm first uses Moore’s method of ‘translating into the concrete’ (Moore 1918, p. 112), where an abstract statement is considered in terms of its specific implications. Thus he understands the claim as amounting to the denial that anything is ever to the left of anything else, that anything is ever above anything else, that anything ever happens earlier or later than anything else, and so on. It is the denial that such states of affairs ever exist. Furthermore, for a philosopher to actually make such a denial (as opposed to just parroting words), she must understand the meanings of the expressions contained therein. She must understand what it means to say that one thing is under another, that one event occurred after another, and so forth.

But how, Malcolm asks, could one ever have come to understand the meaning of such expressions as ‘after’, ‘to the left of’, ‘above’, and ‘under’? Only, he maintains, by our being shown or being acquainted with actual instances (or ‘paradigms’) of things being to the left of other things, of things being above other things, and so on (1992/1942, p. 120). Therefore, for Malcolm, spatial and temporal relations must exist for us to understand the meanings of such expressions and thus, ironically, the existence of space and time is a precondition for the possibility of denying their existence. Or at least the skeptic owes us an explanation of how he can understand spatial and temporal vocabulary on the assumption that spatial and temporal relations do not exist (Soames 2003, p. 166).

The skeptic could respond, however, by simply denying that he understands spatial and temporal vocabulary. That is, the skeptic’s claim might be that such vocabulary has no intelligible meaning, a claim which he perhaps misleadingly expressed by saying ‘Space and time don’t exist’ (as misleading as it would be to say ‘Square circles don’t exist’, as if to imply that there is an intelligible description there that nothing happens to satisfy). And Malcolm does suggest something of this sort in saying that the skeptic’s real point is that these ideas are subtly self-contradictory. However, Malcolm claims that no expression that has a descriptive use is self-contradictory, and he maintains that these expressions do have descriptive uses.

Taking their cue from Malcolm, some commentators have interpreted the PCA as applying only to expressions whose meanings are so fundamental or irreducible that they can be conveyed only ostensively (for example, Alexander 1958, p. 119). Certain defenders were then reproached for attempting paradigm case arguments with expressions apparently not of this type (Passmore 1961, p. 115; Watkins 1957a, p. 29). For instance, the most intense discussion of the PCA was in relation to the expression ‘free will’, which should probably not be regarded as this kind of expression. It was noted that the meanings of certain expressions can be formed and learned by our associating them with an abstract specification or definition. In other cases, our understanding can be derived from examples, but examples that are fictional, like when we learn what miracles are by reading about miraculous events in myths and stories (Watkins 1957a, p. 27). In both cases it remains an open question whether the expression denotes anything real. Given that ‘free will’ could be an expression of those types, no inference can be made from the fact that ‘free will’ has a meaning or is understood by us to the conclusion that there is free will.

However, a different version of the PCA exists that does not rely on the idea that the meaning of the relevant expression ‘must be shown and cannot be explained’. To see this, we will look in some detail at how the PCA works in relation to the controversial topic of free will skepticism.

5. Flew’s Version of the PCA

Next we will examine a particular application of the PCA, Antony Flew’s use of it to rebut skepticism about actions done of one’s own free will, which we may call ‘free actions’ for short. By focusing on a particular application, and the one that has generated the most discussion, we can examine the argument’s logical features in some depth. The following quotations, then, are Flew’s presentation of it from his earlier papers on the topic. Though these were the most frequently quoted and discussed presentations of the PCA, we will see that they were problematic and that he reached a more mature understanding of it in his later work. These problems largely stem from clinging to Malcolm’s model of the PCA with a concept for which it is not appropriate.

Crudely: if there is any word the meaning of which can be taught by reference to paradigm cases, then no argument whatever could ever prove that there are no cases whatsoever of whatever it is. Thus, since the meaning of ‘of his own freewill’ can be taught by reference to such paradigm cases as that in which a man, under no social pressure, marries the girl he wants to marry (how else could it be taught?): it cannot be right, on any grounds whatsoever, to say that no one ever acts of his own freewill. For cases such as the paradigm, which must occur if the word is ever to be thus explained (and which certainly do in fact occur), are not in that case specimens which might have been wrongly identified: to the extent that the meaning of the expression is given in terms of them they are, by definition, what ‘acting of one’s own freewill’ is. (Flew 1955a, p. 35)

Here is another more concise statement of the argument:

As the meaning of expressions such as ‘of his own free will’ is and must ultimately be given by indicating cases of the sort to which it is pre-eminently and by ostensive definition applicable, and not in terms of some description (which might conceivably be found as a matter of fact not to apply to anything which ever occurs); it is out of the question that anyone ever could now discover that there are not and never have been any cases to which these expressions may correctly be applied. (Flew 1954, p. 54)

There are at least two errors in this. Firstly, Flew claims in places that the meaning of ‘free will’ must be given by referring to paradigm cases. But this is not right. As suggested above, it seems possible that its meaning could be given with a definition (‘A free action is an action that . . .’). It would then be an open question whether there is anything satisfying the definition. Flew came to think that this ‘must’ claim was unnecessarily strong, and that for his argument to work it is enough that the meaning of ‘free action’ can be given by referring to paradigm cases (1957, p. 37).

But secondly, even if the meaning of ‘free action’ can be given by referring to paradigm cases, that would not entail that there must be cases of free action (that is, Flew is wrong in saying that the paradigm cases ‘must occur if the word is ever to be thus explained’). For cases can be real or hypothetical, and it is not necessary that the paradigm cases occur for it to be possible to explain the meaning of a term by describing them (Chisholm 1951, pp. 327–328; Hallett 2008, p. 86). Indeed, even Flew himself, in the first passage, seems to describe a hypothetical case of a man who under no social pressure marries the woman he wants to marry to explain the meaning of ‘free will’ (at least he does not tell us that he is referring to some actual case he is familiar with). We all know that such cases occur of course, but it is a contingent fact that they do (our world might have been one where all marriages were arranged and obligatory) and that fact has no bearing on the pedagogical usefulness of the case.

Thus it would not be the mere fact that the meaning of ‘free action’ is or can be explained in terms of paradigm cases that guarantees that there are free actions. It would, rather, be the fact that the meaning of ‘free action’ can be explained in terms of certain paradigm cases, together with the fact that such paradigm cases actually occur, that would guarantee that there are free actions. This two-step structure of the PCA is noted by Marconi when he says, ‘it is not enough, to refute skepticism about miracles, that the turning of water into wine would be ordinarily described as a miracle, for it is far from uncontroversial that such an event ever took place’ (2009, pp. 118–119).

Flew elucidates the structure of the argument along these lines, and achieves a more mature understanding of the PCA, in a later paper. There he says that the ‘logical form of this argument type consists in two steps: The first is an insistence upon (what is taken to be) a plain matter of fact [that is, that certain cases exist or happen] . . . The second step consists in the assertion that examples such as those presented just are paradigm cases of whatever it is which it is being so paradoxically denied’ (1982, p. 116; also see Donnellan 1967, p. 108). Thus Flew’s paradigm case argument for free actions consists of two premises.

P1: As ‘a plain matter of fact’, cases exist where a man marries the woman he loves and wants to marry without threats, pressure, or compulsion.

P2: Such cases are paradigm cases of free actions.

Conclusion: Free actions exist.

Here we can see that one of the premises is an existential statement, with the other saying that the thing quantified over is a paradigm case of whatever the skeptic is denying. In other words, one premise says that there exist cases matching a particular description, while the other says that anything matching such a description is a paradigm case of an X (where ‘X’ refers to what the skeptic claimed not to exist). Together they yield the conclusion that there are Xs.
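
The logical skeleton just described can be written out in first-order notation. In the sketch below, the predicate letters D (‘matches the description given in premise 1’) and X (‘is an X’) are labels introduced here purely for illustration and do not appear in Flew’s text. The second premise is read, as an assumption of this sketch, as implying that every case matching the description is an X, on the grounds that a paradigm case of an X is at least an X.

```latex
% Requires the amsmath package.
% D(x): x is a case matching the description in P1 (illustrative label)
% X(x): x is an X, the kind of thing the skeptic denies (illustrative label)
\begin{align*}
\text{P1:}\quad & \exists x\, D(x) \\
\text{P2:}\quad & \forall x\, \bigl(D(x) \rightarrow X(x)\bigr) \\
\text{Conclusion:}\quad & \exists x\, X(x)
\end{align*}
```

On this reading the inference is valid: take any case a that witnesses P1, apply P2 to a, and conclude X(a) by modus ponens. Any dispute must therefore concern the premises rather than the step from premises to conclusion.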

But that is not all, since the PCA is known to draw on linguistic considerations somehow. This is not evident in the above argument schema, so where do they enter into it? They enter into it, it seems, in justifying the second premise. Thus the defender will say that those cases are paradigms of free actions because the meaning of ‘free action’ is taught or explained with reference to such cases, or because we ordinarily say of such cases that the agent ‘acted of his own free will’.

The justificatory significance of ordinary linguistic usage is discussed below. But now that we have identified the basic structure of Flew’s argument, let us first look at the various avenues of criticism available to the skeptic.

6. Critical Responses to Flew’s PCA

a. Challenging the First Premise

Critics of Flew’s PCA have tended to grant premise 1 as just being an uncontroversial empirical truth. Yet perhaps premise 1 could be resisted if we insist on understanding ‘compulsion’ or ‘being forced/constrained’ in a particular way, such that any kind of deterministic cause ‘compels’ its effect or ‘forces’ the effect to happen, so that nobody could act without compulsion in a deterministic universe (see Beebee 2013, p. 110; Hardie 1957, p. 21). Here the analytic effort would move to the ideas of compulsion or of being forced, which would need to be clarified. So although the premise here is supposed to be a statement of plain empirical fact, it could be challenged through the development of a conceptual point.

b. Challenging the Second Premise

But the main focus of attention has been on premise 2. Are such marriages indeed paradigm cases of acting freely? Or if we tend to judge that they are, is this only because of certain assumptions we are making about those cases that were unmentioned in Flew’s description, assumptions that might be open to challenge?

Some critics have argued that advocates of the PCA err by assuming a sharp distinction between teaching the meaning of a word by presenting cases and by giving criteria. For mixtures of these can also occur when we explain the meaning of a word with reference to cases, but cases that are interpreted as satisfying certain criteria (Ayer 1963, pp. 17–18; Gellner 1959, p. 34; Passmore 1961, pp. 115–116). Consider, for instance, a superstitious society where people believe in miracles. There, when explaining what a miracle is, people might refer to cases such as when the leader suddenly and inexplicably recovered from a grave illness, and others involving a sharp turnaround in fortune, but it is being assumed that these turnarounds satisfy the description of being caused by the intervention of a spiritual being. Notice that here the meaning of ‘miracle’ is being explained with reference to real cases, but this does not prove that there are miracles. For the cases are being interpreted in a certain way and the interpretation could be wrong. Could it be the same with the marriage cases? Do we think they are cases of acting freely only because of some contentious background features that we assume to apply to them?

This thinking is evident in David Papineau’s criticism of the PCA when he says, ‘Maybe ordinary people are happy to apply the term “free will” to such actions as drinking a cup of coffee or buying a new car. But this is only because they are implicitly assuming that these actions are not determined by past causes. But in fact they are wrong in this assumption. All human actions are determined by past causes’ (1998, p. 133). Similarly, John Passmore grants that it is natural for us to describe grooms as acting freely in the circumstances described by Flew, but he adds that ‘we have also learned criteria: we have been told that a person acts of his own free will only when his action proceeds from an act of will . . . [with] the metaphysical peculiarity of being uncaused’ (1961, p. 118; also see Ayer 1963, p. 18; Lucas 1970, p. 12). Passmore’s implication is that in saying that the groom acted freely, we are implicitly assuming that he satisfied this criterion.

Note that these philosophers are making claims about what ordinary speakers mean when they talk of free actions, and thus about the ordinary or ‘folk’ concept of free action, saying that it involves the idea of an uncaused or undetermined act. They are, in that respect, engaging in ‘ordinary language philosophy’ with Flew, and disputing his (more implied than stated) characterization of the ordinary concept. However, it is not enough for them to simply claim that this is a feature of the ordinary concept of a free action. There is an onus on them to support that claim with methods or evidence appropriate for this task.

But what support could they provide? An old-school ordinary language philosopher like Flew would appeal to ordinary linguistic usage to support the idea that free action is, roughly, doing what you want to do without pressure or duress, pointing out that this explains the fact that we say of a groom who marries the woman he loves and wants to marry that he marries of his own free will, but not of the groom in an arranged marriage or shotgun marriage. As an old-schooler, moreover, he would be confident that he knows well what the ordinary use of ‘free will’ is just by being fluent in English. Others who think that philosophy should be more ‘scientific’ in its methods would think it necessary to gather some empirical data on ordinary speakers’ judgments through surveys. (Interestingly, one such study yielded ideas similar to Flew’s; see Monroe and Malle 2010.) However, Papineau’s and Passmore’s criterion (that a free action is one not determined by past causes) does not seem to explain this usage at all. For we might not doubt that in both happy marriages and ones involving coercion the groom’s saying ‘I do’ can be causally explained (crudely, by love in the former and fear in the latter) and that neither sort of explanation is any less deterministic than the other. We would not speak of these cases differently if this were our criterion of free action, and it is not clear what practical usefulness the expression would have on that understanding.

Another kind of support for claims about what speakers mean or are implicitly assuming is the speakers’ own admissions or acknowledgments. When someone describes an event as a miracle, for instance, we can elicit his acknowledgment that in doing so he was thinking that a deity intervened. But will we be able to elicit from an ordinary speaker the acknowledgment that when he said that Debora married of her own free will, he meant that her marrying was not determined by past causes? Can we regard something as part of what a person meant in saying something if he does not acknowledge it as part of what he meant? Papineau and Passmore would need to allay the suspicion that their characterization of the ordinary meaning of ‘free action’ is an imposition from philosophical theory. It is not clear, for instance, where exactly we have ‘been told’ the criteria for free action that Passmore says we have been told, besides in the philosophy classroom.

Of course, these critics’ assumption that a free act is uncaused or undetermined must have come from somewhere, and Flew and Malcolm insisted that a thorough investigation of the ‘intellectual sources’ of the skeptic’s claim must be carried out, to identify the comparisons, pictures, analogies, and so forth that lure us towards it. Any PCA will seem shallow without this concomitant.

To sum up, these ways of challenging the paradigm case argument involve contesting the defender’s claim about what the relevant expression ordinarily means. But this requires that the skeptic play and beat the ordinary language philosophers (in the wide sense of those who work on elucidating the meanings of ordinary expressions, which could include certain experimental philosophers) at their own game. Skeptics who dispute a defender’s claim about what ordinary speakers identify as the paradigm cases of something, or about what exactly ordinary speakers are assuming in making such identifications, must supply evidence appropriate for determining the character of ordinary concepts, a burden which, of course, also applies to the defenders.

Another philosopher who questioned whether Flew’s description identifies a paradigm case of free action is MacIntyre (1957). Suppose we are told that the groom’s falling in love with the bride was due to a hypnotic suggestion (assuming such things are authentic). MacIntyre maintains that in that case the groom would not have married of his own free will (though it could be autonomy that is lacking here, rather than free will; on this distinction, see Christman 2015, section 1.1; Piper 2010, section 2c). The defender would reply that though such an etiology was not explicitly ruled out by Flew’s description of the case, we were supposed to imagine that this was an ordinary case and thus that no such extraordinary things happened. But to this MacIntyre says that there ‘is no relevant difference in the logical status between explanations in terms of endocrine glands [or whatever the explanation is in ordinary cases] and those which refer us to hypnotic suggestion’ (1957, p. 31).

This kind of move—claiming that there is no important difference between putative paradigm cases of free action and of unfree action—is a familiar one from free will skeptics, and it is independent of the particulars of the paradigm case argument. It also leads to stalemate, since given that sameness and difference are symmetrical relations we can argue the other way around just as cogently: we can take our intuitions about the free action case for granted and say that because the unfree action case is no different in its essentials, it is, despite initial appearances, a case of free action (see Beebee 2013, p. 85).

c. The Charge of Irrelevance

Other critics have taken a different, more concessionary approach to dealing with the PCA over the free will issue. Rather than contesting Flew’s characterization of the ordinary meaning of ‘free will’, they agree with it, but maintain that this is just not the concept of free will that is relevant to the philosophical debates. For instance, Danto agrees with Flew that ‘when, in ordinary contexts, we say that Smith married of his own free-will, we mean only that there was no shotgun being pointed at him by an angry father (or something like this). We do not deny that marriages are predictable, or even that this marriage was’ (1959, p. 124). We just mean that he was not made to do it against his will, pressured or strong-armed into doing something he did not want to do (1959, p. 123). However, ‘ordinary language so construed is simply irrelevant to the celebrated problem of the freedom of the will’ (p. 121), which is a ‘metaphysical problem’ that can be solved only with a ‘metaphysical solution’ (p. 124). Similarly, some philosophers have been explicit in saying that the free will that philosophers are curious about is not the free will that we speak of in daily life (Hardie 1957, p. 30; van Inwagen 2008, p. 329, note 1). Relatedly, others try to distinguish freedom of action from freedom of will and shift the debate towards the latter idea (see McKenna and Pereboom 2016, p. 10). The former idea roughly corresponds to what Flew was talking about, while the latter is supposedly something quite different and concerns choice or decision rather than action, and is less in common currency.

Though the sharp disparity between the views of the defender and the skeptic would be well explained by this idea that they are ‘talking past each other’, operating with different notions, there is a problem with it. There is an unwritten rule (or a ‘conversational maxim’, to use a Gricean expression) that we must tell our readers that we are using some expression in an unusual sense if we are doing so. This is to prevent misunderstanding and confusion, since we naturally interpret a person’s words to have their ordinary signification unless told to do otherwise. However, most philosophers, not to mention psychologists and neuroscientists, do not say that they are using ‘free will’ or ‘free action’ in some special or unusual sense in their written works on this topic. Thus, if they are doing this, then many of them are being irresponsible by not being upfront about it. This omission would be excusable if it were common knowledge that ‘free will’ is being used in some non-standard sense in the literature, but this is hardly true, especially considering that some philosophers have said the exact opposite: that in the free will debate we are investigating whether free will exists as ordinarily conceived (see, for example, Jackson 1998, p. 31).

In light of these conflicting indications, it is simply not clear whether in the debates about the existence of free action it is free action in the ordinary sense that is being discussed. One way to find clarity on this, however, might be through reflection on the related phenomenon of moral responsibility. Most philosophers have not been interested in free will just for its own sake but because of its importance for moral responsibility, believing that whether we can be held morally accountable for our actions, and can be deserving of praise and blame, turns on whether we can act freely. Thus, to the question ‘What sense of free will are you talking about?’, some might reply, ‘The one that matters for moral responsibility’. However, this might not be of great help because even if there is some ‘metaphysical’ notion of free will that is critical for moral responsibility, the ordinary notion of free will is also important for it. For ordinarily if we are told that someone did something terrible, but are then told that he did not do it of his own free will, we will (if we believe this) infer that he is less responsible for having done it.

7. “Ordinary Language is Correct Language”

Let us look again at premise 2 of Flew’s PCA. This stated that cases matching a certain description are paradigm cases of free action. But how does a defender support such a claim? By referring to linguistic considerations. By saying that these are the kinds of cases that we ordinarily or standardly call ‘free actions’, or that these are the kinds of cases that we would refer to when teaching or explaining the meaning of ‘free action’. Furthermore, we can take the former to be the most fundamental consideration because the meaning of a term can be taught or explained correctly or incorrectly, depending on whether the instruction reflects the ordinary use, and besides, much of our native language is not learned from explicit instruction.

But can we safely infer from the fact that a certain sort of case or thing is ordinarily called ‘X’ that it is in fact an X? It seems easy to find reasons to dismiss this principle. After all, didn’t people in superstitious societies ordinarily refer to certain events as miracles, or to the Sun as a deity, while being incorrect in saying those things?

The idea that if something is ordinarily called ‘an X’ then it is an X was expressed by Malcolm in his statement that ‘ordinary language is correct language’ (Malcolm 1992/1942, pp. 118, 120), which came to be regarded as a central slogan of ordinary language philosophy. As a slogan, however, this needs deciphering. Malcolm explained what he meant in saying this by distinguishing between two kinds of mistakes that can be made when making a statement: being mistaken about the facts, and using incorrect language (1992/1942, p. 117). The distinction can be illustrated with a case adapted from Malcolm. Suppose that Jones and Smith see an animal in some bushes at a distance, and Jones claims it is a wolf while Smith claims it is a fox. After it emerges from the bushes, Jones clearly sees that it has the characteristics of a fox and that he was mistaken. This was a factual mistake. But imagine another case where they both see the animal clearly and are in full agreement on what its characteristics are, though Jones claims it is a wolf while Smith claims it is a fox. Though the form of their disagreement is the same as before, we now have a linguistic rather than a factual disagreement: they disagree about what a thing of this sort is called. At least one of them is mistaken about the meaning of these words. (Though Malcolm contrasts ‘factual’ with ‘linguistic’ disagreement here, he would not deny that a linguistic mistake is based on a factual error (see Malcolm 1940). That a word has the particular meaning that it has is, of course, a kind of fact. This contrast might therefore be better described as one between linguistic and non-linguistic facts, and one might want to press Malcolm to clarify it further.)

But then Malcolm asks us to imagine the second disagreement again, though with Jones acknowledging that an animal of this sort is ordinarily called ‘a fox’ while maintaining that it is nevertheless incorrect to call it that and correct to call it ‘a wolf’. According to Malcolm, this would be absurd. It is absurd, he says, because ordinary language is correct language. To refute Jones’ claim here it suffices to say, ‘But that’s not what people call it.’

In his discussion of the paradigm case argument, Diego Marconi criticizes this view. He agrees that if some things are correctly called ‘Xs’ then they are Xs (2009, p. 116). But he disagrees that if some things are ordinarily called ‘Xs’ then they are correctly called ‘Xs’. For people might only be calling them ‘Xs’ because they appear to be Xs when in fact they are not Xs (p. 119). This seems right as far as it goes. However, if people are always calling some things ‘Xs’ because they appear to be Xs while not being Xs, then they are like Jones who called a fox ‘a wolf’ because it appeared to be a wolf to him: they are factually mistaken. Malcolm’s idea was that if some things are ordinarily called ‘Xs’ and if no factual mistakes are being made about them, then they are Xs. That is, Malcolm’s slogan represented an attempt to characterize a notion of linguistic correctness, saying that, assuming no factual mistakes are being made about it, the correct thing to call something is what everyone calls it (but for a hard case, see Watkins 1957a, p. 28). The factual/linguistic error distinction is indispensable for understanding the slogan.

8. Ordinary Usage as Practices

It is possible to gain a deeper understanding of why the defender puts so much weight on ordinary usage. But first let us return to an earlier point. We saw earlier that according to the defender, the PCA allows us to reject the skeptical position that there are no Xs without having to examine the skeptical argument. What is the source of this supposed imperviousness to skeptical argument? Can such an apparently dogmatic attitude be tolerated in philosophy? Consider again the skeptic who argued that there are no cases of seeing people. The defender responded by making the simple point that we ordinarily say that we see people in cases where we look at them clothed, cases that were deemed not to be cases of seeing people by the skeptical argument. But why exactly does the fact that we ordinarily say that make it correct to say that? And why should that ordinary usage be unassailable?

The reason is that the defender thinks she is describing what could be called a linguistic practice, custom, convention, or rule. She is trying to point out that it is our practice or custom, or a rule of our language, to call cases of this sort cases of seeing people. Now such things as practices, customs, or rules are open to criticism in various ways. For instance, a rule of a game can be criticized for making the game too long, too complicated, too inconvenient, too dangerous, or less exciting, and rules are sometimes changed to improve games along these lines. But it cannot be criticized for being incorrect, since practices, customs, or rules cannot be correct or incorrect.

Consider the rule in chess that the bishops can move only diagonally, for instance. What sense can there be in saying that this rule is correct? It is, indeed, one of the rules of chess. It is correct to say that this is a rule of chess. The statement that this is a rule of chess is correct. A move may be correct by being in conformity with it. But the rule itself is not correct; it is simply followed, and its being followed makes it one of the rules of chess (though something can also be a rule in virtue of being decreed by a relevant authority, even if people ignore it). Admittedly, we might sometimes speak loosely of a ‘correct rule’. But ‘correct’ here is redundant; ‘These are the correct rules of chess’ is just an emphatic way of saying, ‘These are the rules of chess’. For we have no understanding of what an incorrect rule of chess would be. Would moving the bishop vertically and horizontally be an example? No, since we can reprimand someone doing that by saying, ‘That’s not the rule for the bishop’. (It would confuse him to say ‘That is indeed a rule for the bishop, but an incorrect one’.)

So when a defender says, ‘We (ordinarily) call cases of this sort cases of seeing a person’, she is trying to say, ‘It is our practice/custom/rule to call cases of this sort cases of seeing a person’, and as such it is not the kind of thing that could be refuted by an argument. It is not something that could be proven by any argument either, just as a rule of chess can be neither proven nor refuted (though statements as to what are the rules of chess can be proven or refuted). Wittgenstein called this ‘bedrock’, where ‘I am inclined to say: “This is simply what I [or better, what we] do”’ (2009/1953, §217; also see §654). As practices or rules of our ‘language-game’ they are self-standing; they are things that philosophers ‘cannot justify’ in an evidential sense and must ‘leave as they are’.

But if a linguistic practice cannot be correct or incorrect, how does this help the defender? For didn’t the defender want to claim that it is correct to say that such-and-such a case is a case of seeing a person? Indeed, but note what she is claiming here: that it is correct to say that such-and-such a case is a case of seeing a person. The statement is what is correct here, not the practice, and it is correct by being in conformity with the practice. The point here is that though practices cannot be correct or incorrect, they are determiners of correctness. Thus a move in chess can be correct by being in conformity with the rules of chess, or a man’s manner of addressing the Queen can be correct by being in conformity with the accepted customs for addressing the Queen. Similarly, certain kinds of statements can be correct (not just grammatically correct, but true) by being in conformity with the rules of English. Thus the statement that some case, C, is a case of an X can be a correct and true statement by being in conformity with the practice of calling Cs ‘X’. (To take a simple example, ‘This color is orange’ can be true and correct by being in line with our practice of calling that color ‘orange’.) And this can be a practice just because it is followed, because the relevant people ordinarily do it.

Thus the paradigm case argument works in part by reminding us of what our linguistic practices are, practices that determine what it is to play the ‘game’ of speaking the relevant language, practices that the skeptic too, in unguarded moments or as a layperson, can be seen to participate in. This, however, is not to say that we should never break the linguistic rules that we currently follow. No prohibition is being urged here on creativity or novelty in the use of language; we are not being urged to never stray from the bounds of conventional and correct speech. The defender only wishes to maintain, against the skeptic, that calling certain things cases of seeing people, calling certain other ones cases of acting freely, and so forth, is not incorrect speech, insofar as it is in conformity with our linguistic customs to do so. Nor is it to deny that those linguistic practices can be criticized as problematic for reasons unrelated to correctness or truth, such as for pragmatic, moral, or political reasons.

9. Conclusion

So, does the paradigm case argument work? There does not seem to be anything intrinsically fallacious about it at least, but this general sort of question is not a good one to ask. First, we have seen that it is problematic to speak of the paradigm case argument, since two versions of it can be distinguished. But more importantly, it may be a bad question to ask because every topic to which it is applied may have its own peculiarities, such that a PCA may work in one application but not in another. For instance, we have seen that with free will skepticism there is a possibility that ‘free will’ is being used in a technical or unusual sense, which would make a PCA type of argument inapplicable to that topic, though nothing similar might be going on with some other topics. Applications of the PCA thus should be judged on a case-by-case basis.

Assessing the influence of the PCA on the analytic philosophical tradition is less easy than it would seem. By one measure, that of observing philosophers explicitly using or referring to the argument and accepting its conclusions, we would have to say that its influence has not been great. However, it is unclear just how much weight we should put on that measure since, as Gilbert Harman said, a ‘philosopher’s acceptance of the paradigm case argument need not be revealed in any explicit statement of the argument, since this acceptance may show itself in the philosopher’s attitude towards skepticism’ (1990, p. 7; also see Gellner 1959, p. 32).

For instance, this acceptance might be manifested in a philosopher’s tendency to treat things commonly or ‘intuitively’ identified as paradigm cases of an X as a datum for the purpose of developing a theory of X (by, for instance, trying to extract necessary or sufficient conditions from the cases), despite the existence of skeptical traditions that deny the existence of Xs. It is not uncommon to see philosophers proceeding in this way (sometimes called ‘the method of cases’) in positive theory development. If pushed to justify this procedure, the philosopher could (but might not) resort to something like the PCA. Skeptics might insist that this philosopher has no right to assume that those ‘paradigm cases’ are genuine paradigms without refuting their skeptical arguments. But defenders can attempt to turn the tables on the skeptics by requesting that they answer these questions. Any skeptical argument against the existence of any X must be based on some conception or analysis, implicit though it may be, of what X is. But how can we know that we have the right conception or analysis of X? Is there a better alternative to using the method of cases? And if not, might depending on the method of cases commit us to non-skepticism about X?

10. References and Further Reading

  • Alexander, H. G. (1958). More about the paradigm-case argument. Analysis, 18(5), pp. 117–120.
  • Ayer, A. J. (1963). Philosophy and language. In The Concept of a Person and Other Essays. London; Basingstoke: Macmillan, pp. 1–35.
  • Beattie, C. (1981). The paradigm case argument: its use and abuse in education. Journal of Philosophy of Education, 15(1), pp. 77–86.
  • Beebee, H. (2013). Free Will: An Introduction. Basingstoke: Palgrave Macmillan.
  • Black, M. (1973). Paradigm cases and evaluative words. Dialectica, 27(1), pp. 261–272.
  • Black, M. (1958). Making something happen. In Determinism and Freedom in the Age of Modern Science (S. Hook, ed.). New York: New York University Press, pp. 31–45.
  • Blanchard, B. (1962). Reason and Analysis. London: George Allen and Unwin Ltd. See chap. 7.
  • Butchvarov, P. (1964). Knowledge of meanings and knowledge of the world. Philosophy, 39(148), pp. 145–160.
  • Campbell, C. A. (1944–45). Common-sense propositions and philosophical paradoxes. Proceedings of the Aristotelian Society, 45, pp. 1–25.
  • Chappell, V. C. (1961). Malcolm on Moore. Mind, 70(279), pp. 417–425.
  • Chisholm, R. (1951). Philosophers and ordinary language. The Philosophical Review, 60(3), pp. 317–328.
  • Christman, J. (2015). Autonomy in moral and political philosophy. Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/autonomy-moral/.
  • Danto, A. (1959). The paradigm case argument and the free-will problem. Ethics, 69(2), pp. 120–124.
  • Descartes, R. (2008/1641). Meditations on First Philosophy. Trans. M. Moriarty. Oxford: Oxford University Press.
  • Donnellan, K. (1967). Paradigm-case argument. In The Encyclopedia of Philosophy (P. Edwards, ed.). New York: Macmillan, pp. 106–113.
  • Eveling, H. S. & Leith, G. O. M. (1958). When to use the paradigm-case argument. Analysis, 18(6), pp. 150–152.
  • Flew, A. G. N. (1982). The paradigm case argument: abusing and not using the PCA. Journal of Philosophy of Education, 16(1), pp. 115–121.
  • Flew, A. G. N. (1966). Again the paradigm. In Mind, Matter, and Method: Essays in Philosophy and Science in Honor of Herbert Feigl (P. K. Feyerabend & G. Maxwell, eds.). Minneapolis: University of Minnesota Press, pp. 261–272.
  • Flew, A. G. N. (1957). ‘Farewell to the paradigm-case argument’: a comment. Analysis, 18(2), pp. 34–40.
  • Flew, A. G. N. (1955a). Philosophy and language. The Philosophical Quarterly, 5(18), pp. 21–36.
  • Flew, A. G. N. (1955b). Divine omnipotence and human freedom. In New Essays in Philosophical Theology (A. Flew & A. MacIntyre, eds.). London: SCM, pp. 144–169.
  • Flew, A. G. N. (1954). Crime or disease. The British Journal of Sociology, 5(1), pp. 49–62.
  • Gellner, E. (1959). Words and Things: A Critical Account of Linguistic Philosophy and a Study in Ideology. Great Britain: Victor Gollancz.
  • Hallett, G. L. (2008). Linguistic Philosophy: The Central Story. Albany, N. Y.: State University of New York Press. See chapter 10.
  • Hanfling, O. (1990). What is wrong with the paradigm case argument? Proceedings of the Aristotelian Society, 91, pp. 21–38.
  • Hardie, W. F. R. (1957). My own free will. Philosophy, 32(120), pp. 21–38.
  • Harman, G. (1990). Skepticism and the Definition of Knowledge. London; New York: Routledge. See chapter 1.
  • Harre, R. (1958). Tautologies and the paradigm-case argument. Analysis, 18(4), pp. 94–96.
  • Heath, P. L. (1952). The appeal to ordinary language. The Philosophical Quarterly, 2(6), pp. 1–12.
  • Houlgate, L. D. (1962). The paradigm-case argument and ‘possible doubt’. Inquiry, 5(1–4), pp. 318–324.
  • Jackson, F. (1998). From Metaphysics to Ethics: A Defence of Conceptual Analysis. Oxford: Clarendon Press.
  • King-Farlow, J. & Rothstein, J. M. (1964). Paradigm cases and the injustice to Thrasymachus. The Philosophical Quarterly, 14(54), pp. 15–22.
  • Lucas, J. R. (1970). The Freedom of the Will. Oxford: Oxford University Press.
  • MacIntyre, A. C. (1957). Determinism. Mind, 66(261), pp. 28–41.
  • McKenna, M. & Pereboom, D. (2016). Free Will: A Contemporary Introduction. New York; London: Routledge.
  • Malcolm, N. (1963). George Edward Moore. In Knowledge and Certainty: Essays and Lectures by Norman Malcolm. Englewood Cliffs, N. J.: Prentice-Hall, Inc, pp. 163–183.
  • Malcolm, N. (1951). Philosophy for philosophers. The Philosophical Review 60(3), pp. 329–340.
  • Malcolm, N. (1992/1942). Moore and ordinary language. In The Linguistic Turn (R. Rorty, ed.). Chicago and London: The University of Chicago Press, pp. 111–124. Originally published in (1942) The Philosophy of G. E. Moore (Paul A. Schilpp, ed.). Evanston and Chicago: Northwestern University Press, pp. 345–368.
  • Malcolm, N. (1940). Are necessary propositions really verbal? Mind, 194, pp. 189–203.
  • Marconi, D. (2009). Being and being called: paradigm case arguments and natural kind words. The Journal of Philosophy, 106(3), pp. 113–136.
  • Monroe, A. E. & Malle, B. F. (2010). From uncaused will to conscious choice: the need to study, not speculate about people’s folk concept of free will. Review of Philosophy and Psychology, 1(2), pp. 211–224.
  • Moore, G. E. (1939). Proof of an external world. Proceedings of the British Academy, 25, pp. 273–300.
  • Moore, G. E. (1918). The conception of reality. Proceedings of the Aristotelian Society, 18(1), pp. 101–120.
  • Papineau, D. (1998). Methodology: the elements of the philosophy of science. In Philosophy 1: A Guide Through the Subject (A. C. Grayling ed.). Oxford: Oxford University Press, pp. 123–180.
  • Passmore, J. (1961). Philosophical Reasoning. London: Duckworth. See chapter 6.
  • Parker-Ryan, S. (2010). Reconsidering ordinary language philosophy: Malcolm’s (Moore’s) ordinary language argument. Essays in Philosophy, 11(2), pp. 123–149.
  • Piper, M. (2010). Autonomy: normative. Internet Encyclopedia of Philosophy.
  • Richman, R. J. (1962). Still more on the argument of the paradigm case. Australasian Journal of Philosophy, 40(2), pp. 204–207.
  • Richman, R. J. (1961). On the argument of the paradigm case. Australasian Journal of Philosophy, 39(1), pp. 75–81.
  • Soames, S. (2003). Philosophical Analysis in the Twentieth Century, vol. 2. Princeton, New Jersey: Princeton University Press. See chapter 7.
  • Stebbing, S. (1937). Philosophy and the Physicists. London: Methuen.
  • Stroud, B. (1984). The Significance of Philosophical Scepticism. Oxford; New York: Oxford University Press. See chapter 2.
  • Urmson, J. O. (1953). Some questions concerning validity. Revue Internationale de Philosophie, 7(25), pp. 217–229.
  • van Inwagen, P. (2008). How to think about the problem of free will. The Journal of Ethics, 12(3/4), pp. 327–341.
  • van Inwagen, P. (1983). An Essay on Free Will. Oxford: Clarendon Press. See chapter 4.
  • Watkins, J. W. N. (1957a). Farewell to the paradigm-case argument. Analysis, 18(2), pp. 25–33.
  • Watkins, J. W. N. (1957b). A reply to professor Flew’s comment. Analysis, 18(2), pp. 41–42.
  • Williams, C. J. F. (1961). More on the argument of the paradigm case. Australasian Journal of Philosophy, 39(3), pp. 276–278.
  • Wittgenstein, L. (2009/1953). Philosophical Investigations. Trans. G. E. M. Anscombe, P. M. S. Hacker & Joachim Schulte. Chichester: Wiley-Blackwell.

Author Information

Kevin Lynch
Email: kevinlynch405@eircom.net
Huaqiao University
China

Duality in Logic and Language

Duality phenomena occur in nearly all mathematically formalized disciplines, such as algebra, geometry, logic and natural language semantics. However, many of these disciplines use the term ‘duality’ in vastly different senses, and while some of these senses are intimately connected to each other, others seem to be entirely unrelated. Consequently, if the term ‘duality’ is used in two different senses in one and the same work, the authors often explicitly warn about the potential confusion.

This article focuses exclusively on duality phenomena involving the interaction between an ‘external’ and an ‘internal’ negation of some kind, which arise primarily in logic and linguistics. A well-known example from logic is the duality between conjunction and disjunction in classical propositional logic: \varphi \wedge \psi is logically equivalent to \neg (\neg \varphi \vee \neg \psi), and hence \neg ( \varphi \wedge \psi) is logically equivalent to \neg \varphi \vee \neg \psi. A well-known example from linguistics concerns the duality between the aspectual particles already and still in natural language: already outside means the same as not still inside, and hence, not already outside means the same as still inside (where inside is taken to be synonymous with not outside). Examples such as these show that dualities based on external/internal negation show up for a wide variety of logical and linguistic operators.

Duality phenomena of this kind are highly important. First of all, since they occur in formal as well as natural languages, they provide an interesting perspective on the interface between logic and linguistics. Furthermore, because of their ubiquity across natural languages, it has been suggested that duality is a semantic universal, which can be of great heuristic value. Finally, duality principles play a central role in Freudenthal’s famous proposal for a language for cosmic communication.

Many authors employ the notion of duality as a means to describe the specific details of a particular formal or natural language, without going into any systematic theorizing about this notion itself. Next to such auxiliary uses, however, there also exist more abstract, theoretical accounts that focus on the notion of duality itself. For example, these theoretical perspectives address the group-theoretical aspects of duality, or its interplay with the so-called Aristotelian relations. This article examines a wide variety of dualities in formal and natural languages, and it discusses some of the more theoretical perspectives on duality.

The article is organized as follows. Sections 1 and 2 provide an extensive overview of the most important concrete examples of duality in logic and natural language. Section 3 describes a detailed framework (based on the notion of a Boolean algebra) that allows systematic analysis of these dualities. Section 4 presents a group-theoretical approach to duality phenomena, and Section 5 draws an extensive comparison between duality relations and another type of logical relation, namely those that characterize the Aristotelian square of opposition.

As to the technical prerequisites for this article, Sections 1 and 2 should be accessible to everyone with a basic understanding of philosophical logic. In Sections 3, 4 and 5, the use of some other mathematical tools and techniques is unavoidable; these sections require a basic understanding of discrete mathematics (in particular, Boolean algebra and elementary group theory).

Table of Contents

  1. Duality in Logic
  2. Duality in Natural Language
  3. Theoretical Framework
  4. A Group-Theoretical Approach to Duality
  5. Duality Relations and Aristotelian Relations
  6. References and Further Reading

1. Duality in Logic

Conjunction and disjunction. The most widely known example of duality in logic is undoubtedly that between conjunction and disjunction in classical propositional logic (\mathsf{CPL}). Because of their semantics, i.e. the way they are standardly interpreted in \mathsf{CPL}, these connectives can be defined in terms of each other, and consequently, only one of them needs to be taken as primitive. For example, if conjunction (\wedge) and negation (\neg) are taken as primitives, then disjunction (\vee) can be defined as follows:

(1)   \begin{equation*}\varphi\vee\psi :\equiv \neg(\neg\varphi\wedge\neg\psi).\end{equation*}

Alternatively, if disjunction is taken as primitive, then conjunction can be defined as follows:

(2)   \begin{equation*}\varphi\wedge\psi :\equiv \neg(\neg\varphi\vee\neg\psi).\end{equation*}

Furthermore, each of these equivalences can be derived from the other one; for example, if (1) is taken as primitive, then we obtain (2) as follows:

(3)   \begin{equation*}  \neg(\neg\varphi\vee\neg\psi) \equiv \neg\neg(\neg\neg\varphi\wedge\neg\neg\psi) \equiv \varphi\wedge\psi. \end{equation*}

Finally, in both cases we obtain the well-known laws of De Morgan. For example, if conjunction is taken as primitive, then (4) follows immediately from (1), while (5) follows from (1) via (3):

(4)   \begin{equation*} \neg(\varphi\vee\psi) \equiv \neg\varphi\wedge\neg\psi, \end{equation*}

(5)   \begin{equation*} \neg(\varphi\wedge\psi) \equiv \neg\varphi\vee\neg\psi. \end{equation*}

Equivalences such as (1–5) exhibit the duality between conjunction and disjunction. They clearly show the interaction between an internal negation (which attaches to each of the individual formulas \varphi and \psi, and thus occurs inside the scope of the conjunction/disjunction connective) and an external negation (which occurs outside the scope of the connectives). Equivalences (1–2) show that applying both internal and external negation to a disjunction yields the corresponding conjunction, and vice versa. Similarly, (4–5) show that the internal negation of a disjunction is logically equivalent to the external negation of the corresponding conjunction, and vice versa. All these equivalences are manifestations of the underlying semantics of the conjunction and disjunction connectives in \mathsf{CPL}.
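Since \mathsf{CPL} is truth-functional, equivalences (1), (2), (4) and (5) can be checked mechanically by comparing truth tables. The following is a minimal sketch in Python; the helper name `equivalent` and the variable names are illustrative assumptions, not part of any standard library.

```python
from itertools import product

def equivalent(f, g, arity=2):
    """Return True if f and g agree on every assignment of truth values."""
    return all(f(*vals) == g(*vals)
               for vals in product([False, True], repeat=arity))

# The two connectives, interpreted as in classical propositional logic.
conj = lambda p, q: p and q
disj = lambda p, q: p or q

# (1): disjunction defined from conjunction and negation.
eq1 = equivalent(disj, lambda p, q: not (not p and not q))
# (2): conjunction defined from disjunction and negation.
eq2 = equivalent(conj, lambda p, q: not (not p or not q))
# (4) and (5): the laws of De Morgan.
eq4 = equivalent(lambda p, q: not (p or q), lambda p, q: not p and not q)
eq5 = equivalent(lambda p, q: not (p and q), lambda p, q: not p or not q)

print(eq1, eq2, eq4, eq5)  # True True True True
```

Each lambda on the right-hand side applies the internal negation to both arguments and, where required, the external negation to the whole formula, so the check mirrors the structure of the equivalences themselves.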

Universal and existential quantifiers. Another well-known case of duality concerns the universal and existential quantifiers in classical first-order logic (\mathsf{FOL}). The situation here is largely analogous to that of conjunction and disjunction. Because of their semantics, i.e. the way they are standardly interpreted in \mathsf{FOL}, these quantifiers can be defined in terms of each other, and consequently, only one of them needs to be taken as primitive. For example, if the universal quantifier (\forall) is taken as primitive, the existential quantifier (\exists) can be defined as follows:

(6)   \begin{equation*}\exists x\varphi :\equiv \neg\forall x\neg\varphi.\end{equation*}

Conversely, if the existential quantifier is taken as primitive, then the universal quantifier can be defined as follows:

(7)   \begin{equation*}\forall x\varphi :\equiv \neg\exists x\neg\varphi.\end{equation*}

Again, each of these equivalences can be derived from the other one; for example, if (6) is taken as primitive, then we obtain (7) as follows:

(8)   \begin{equation*}  \neg\exists x\neg\varphi \equiv \neg\neg\forall x\neg\neg\varphi\equiv \forall x\varphi. \end{equation*}

Finally, in both cases we obtain the well-known quantifier laws. For example, if the universal quantifier is taken as primitive, then (9) follows immediately from (6), while (10) follows from (6) via (8):

(9)   \begin{equation*}\neg\exists x \varphi \equiv \forall x\neg\varphi,\end{equation*}

(10)   \begin{equation*}\neg\forall x\varphi \equiv \exists x\neg\varphi.\end{equation*}

Equivalences such as (6–10) exhibit the duality between the universal and the existential quantifier. Again, they show the interaction between an internal negation (which occurs inside the scope of the quantifier) and an external negation (which occurs outside the scope of the quantifier). Equivalences (6–7) show that applying both internal and external negation to an existential quantifier yields the corresponding universal quantifier, and vice versa. Similarly, (9–10) show that the internal negation of an existential quantifier is logically equivalent to the external negation of the corresponding universal quantifier, and vice versa. All these equivalences are manifestations of the underlying semantics of the universal and existential quantifiers in \mathsf{FOL}.
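The quantifier equivalences can be illustrated, though of course not proven for \mathsf{FOL} in general, by exhaustive checking over a small finite domain. The sketch below assumes a hypothetical three-element domain and enumerates every unary predicate on it; the function names `forall`, `exists`, `all_predicates` and `check_quantifier_duality` are introduced here purely for illustration.

```python
from itertools import product

# Hypothetical finite domain of discourse (an assumption of this sketch).
domain = ["a", "b", "c"]

def forall(pred):
    """Universal quantification over the finite domain."""
    return all(pred(x) for x in domain)

def exists(pred):
    """Existential quantification over the finite domain."""
    return any(pred(x) for x in domain)

def all_predicates():
    """Yield every unary predicate on the domain (2**3 = 8 of them)."""
    for values in product([False, True], repeat=len(domain)):
        table = dict(zip(domain, values))
        yield lambda x, table=table: table[x]

def check_quantifier_duality():
    """Check (6), (7), (9) and (10) for every predicate on the domain."""
    for P in all_predicates():
        not_P = lambda x, P=P: not P(x)              # internal negation
        assert exists(P) == (not forall(not_P))      # (6)
        assert forall(P) == (not exists(not_P))      # (7)
        assert (not exists(P)) == forall(not_P)      # (9)
        assert (not forall(P)) == exists(not_P)      # (10)
    return True

print(check_quantifier_duality())  # True
```

Here `all` and `any` play the roles of the iterated conjunction and disjunction over the domain, which is why the check succeeds for the same reason as the laws of De Morgan.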

Modal operators. Another rich source of dualities is the broad family of modal logics. For example, in alethic modal logic, necessity (\Box) and possibility (\Diamond) are dual to each other (11–12), while in deontic logic, obligation (O) and permission (P) are usually taken as duals (13–14):

(11)   \begin{equation*}\Box\varphi \equiv \neg\Diamond\neg\varphi, \qquad \neg\Box\varphi \equiv \Diamond\neg\varphi,\end{equation*}

(12)   \begin{equation*}\Diamond\varphi \equiv \neg\Box\neg\varphi, \qquad \neg\Diamond\varphi \equiv \Box\neg\varphi,\end{equation*}

(13)   \begin{equation*}O\varphi \equiv \neg P \neg\varphi, \qquad \neg O \varphi \equiv P \neg\varphi,\end{equation*}

(14)   \begin{equation*}P \varphi \equiv \neg O \neg\varphi, \qquad \neg P \varphi \equiv O \neg\varphi.\end{equation*}

Blackburn et al. (2001) provide many other modal examples from concrete application domains, such as temporal logic, propositional dynamic logic and hybrid logic, and more mathematically motivated examples, such as the dualities involving the difference modality and the universal modality. In general, an n-ary modal operator is called a triangle (\Delta), and its dual a nabla (\nabla):

(15)   \begin{equation*}\Delta(\varphi_1,\dots,\varphi_n) \equiv \neg\nabla(\neg\varphi_1,\dots,\neg\varphi_n),\end{equation*}

(16)   \begin{equation*}\nabla(\varphi_1,\dots,\varphi_n) \equiv \neg\Delta(\neg\varphi_1,\dots,\neg\varphi_n).\end{equation*}

The equivalences (15–16) again clearly illustrate the interaction between internal and external negation. Note, furthermore, that the internal negation is applied to all formulas (\varphi_1,\dots,\varphi_n). This was also the case with conjunction/disjunction (1–2) and with the universal/existential quantifiers (6–7) (although the latter case is trivial, since in equivalences (6–7) there is only a single formula (\varphi) to which the internal negation can be applied).

Interconnections. Many of the examples given above are systematically related to each other, and might thus be viewed as manifestations of the same underlying duality. First of all, it is well known that the propositional connectives of conjunction and disjunction are related to the universal and existential quantifiers, respectively. For example, the formulas \forall x Px and \exists x Px can informally be viewed as expressing the conjunction Pa \wedge Pb \wedge Pc \wedge \dots and the disjunction Pa \vee Pb \vee Pc \vee \dots, respectively. This reveals a structural similarity between equivalences (1-2) and (6-7). Secondly, in Kripke semantics the modal operators are interpreted as quantifying over possible worlds. For example, the formulas \Box p and \Diamond p can be interpreted as stating that p is true in all possible worlds and that p is true in at least one possible world, respectively. This reveals a structural similarity between equivalences (6-7) and (11-12).
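The reading of \Box and \Diamond as quantifiers over worlds can be checked mechanically on a small model. The following Python sketch is only an illustration: the three-world model, the accessibility relation (a standard refinement of "all possible worlds" that the text above does not spell out) and all function names are assumptions introduced here.

```python
# A tiny, hypothetical Kripke model: worlds, accessibility relation, valuation.
WORLDS = {"w1", "w2", "w3"}
ACCESS = {"w1": {"w2", "w3"}, "w2": {"w2"}, "w3": set()}
P_TRUE_AT = {"w2"}  # the atom p holds only at w2

def p(w):
    return w in P_TRUE_AT

def neg(phi):
    return lambda w: not phi(w)

def box(phi):
    # Box phi: phi holds in ALL accessible worlds (universal quantification).
    return lambda w: all(phi(v) for v in ACCESS[w])

def diamond(phi):
    # Diamond phi: phi holds in AT LEAST ONE accessible world (existential).
    return lambda w: any(phi(v) for v in ACCESS[w])

def equivalent(phi, psi):
    # Two formulas are equivalent on this model if they agree at every world.
    return all(phi(w) == psi(w) for w in WORLDS)

# The duality equivalences for Box and Diamond, checked on this model.
assert equivalent(box(p), neg(diamond(neg(p))))
assert equivalent(neg(box(p)), diamond(neg(p)))
assert equivalent(diamond(p), neg(box(neg(p))))
assert equivalent(neg(diamond(p)), box(neg(p)))
```

Because box is implemented with `all` and diamond with `any`, the modal equivalences reduce to the same De Morgan pattern that relates the universal and existential quantifiers.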

2. Duality in Natural Language

Quantifiers and modalities in natural language. The most obvious class of natural language expressions that give rise to duality behavior are the immediate counterparts of the logical operators discussed in Section 1. For example, the determiners all and some combine with a noun to yield noun phrases such as all books and some books, and seem to correspond directly to the quantifiers \forall and \exists. This correspondence is not entirely unproblematic, since it ignores linguistically relevant distinctions, such as the difference between every and all vis-à-vis collective and distributive predicates (Dowty 1987; Brisson 2003), and the distinction between quantificational and non-quantificational uses of some (Löbner 1987). Setting such considerations aside, however, one can say that the natural language determiners all and some are each other’s duals, just as the first-order quantifiers \forall and \exists are each other’s duals. Similarly, the duality relation between \Box and \Diamond in modal logic also shows up for a whole range of natural language expressions for necessity and possibility. In logic, \Box and \Diamond are almost invariably operators taking propositions as their arguments. In natural language, however, the modal notions are expressed in a variety of linguistic categories, such as modal adjectives (necessary vs. possible), modal adverbs (necessarily vs. possibly) or modal auxiliary verbs (must/should vs. can/may).

Conjunction and disjunction in natural language. The most prototypical duality in logic, namely that between the propositional connectives of conjunction and disjunction, only plays a minor role, if any, in the linguistic realm. The main reason is the ambiguity of natural language and and or, which is often explained pragmatically in terms of conversational implicatures (Horn 2004). For example, natural language conjunction very often conveys additional aspects of causality (\varphi and \psi \equiv \varphi and therefore \psi) or sequentiality (\varphi and \psi \equiv \varphi and afterwards \psi), whereas disjunction is notoriously ambiguous between an inclusive interpretation (\varphi or \psi \equiv \varphi or \psi, and perhaps both) and an exclusive interpretation (\varphi or \psi \equiv \varphi or \psi, but not both). These asymmetrical ambiguities of natural language conjunction and disjunction render the notion of duality less suitable for their linguistic and philosophical analysis, as observed by Humberstone (2011, p. 772):

for many logical purposes [\ldots] conjunction and disjunction are attractively treated in a symmetrical fashion. Inherent asymmetries in the informal conceptual apparatus we bring to bear on logic often make duality an inappropriate consideration to bring in for philosophical purposes, however.

Testing for duality. In logic, duality is a matter of definition or convention; in modal logic, for example, the duality between \Box and \Diamond follows from the way in which the semantics of these operators is defined. By contrast, in linguistics, duality is a much more empirical matter. In other words, duality relations between natural language expressions have to be argued for or demonstrated and may thus be refuted on empirical grounds. For that purpose, duality tests have been devised (Löbner 2011, p. 492ff.), which crucially rely on the relation of lexical inversion holding between predicates such as be on/off, be inside/outside or be here/gone. Testing for internal negation evaluates the equivalence between (i) a proposition O(P) with operator O and predicate P and (ii) a proposition O'(P'), with operator O' = \Tiny{INEG}\small{(O)} being the internal negation of O, and predicate P' = \Tiny{LEXINV}\small{(P)} being the lexical inverse of P; see (17). The examples in (18-19) illustrate the internal negations of the quantifiers:

(17)   \begin{equation*}O(P) & \equiv & \Tiny{INEG}\small(O)(\Tiny{LEXINV}\small{(P)})\end{equation*}

(18)   \begin{equation*} \textbf{Some}~ lights~ are~ \textbf{on}. &\equiv & \textbf{Not all}~ lights~ are~ \textbf{off.}\end{equation*}

(19)   \begin{equation*}\textbf{No}~ children~ are~ \textbf{inside.} &\equiv & \textbf{All}~ children~ are~ \textbf{outside.}\end{equation*}

Testing for duality evaluates the equivalence between (i) a proposition which gives a negative answer to a polarity question of the form O(P) and (ii) a proposition O'(P'), with operator O' = \Tiny{DUAL}\small{(O)} being the dual of O, and predicate P' = \Tiny{LEXINV}\small{(P)} again being the lexical inverse of P; see (20). The examples in (21-22) illustrate the dialogue patterns establishing the duality of the universal and existential quantifiers:

(20)   \begin{equation*} \neg O(P) & \equiv & \Tiny{DUAL}\small{(O)}(\Tiny{LEXINV}\small{(P)})\end{equation*}

(21)   \begin{equation*} Are~ \textbf{some}~ lights~ \textbf{on}? - No, & \equiv & \textbf{all}~ lights~ are~ \textbf{off}.\end{equation*}

(22)   \begin{equation*} Are~ \textbf{all}~ children~ \textbf{inside}? - No, & \equiv & \textbf{some}~ children~ are~ \textbf{outside}.\end{equation*}
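For the logical quantifiers themselves, the test schemas (17) and (20) can be verified by brute force. The sketch below is an illustration under stated assumptions: it models a hypothetical domain of three lights, treats off as the lexical inverse of on, and uses helper names that are not taken from Löbner's work. It says nothing about the empirical acceptability of the corresponding English sentences.

```python
from itertools import product

LIGHTS = ["l1", "l2", "l3"]  # hypothetical domain of lights

def some(pred, state):
    return any(pred(state[x]) for x in LIGHTS)

def all_(pred, state):
    return all(pred(state[x]) for x in LIGHTS)

def not_all(pred, state):
    return not all_(pred, state)

def no(pred, state):
    return not some(pred, state)

def on(value):
    return value

def off(value):
    # Lexical inverse of `on`: a light is off exactly when it is not on.
    return not value

def states():
    # Every assignment of on/off to the lights in the domain.
    for values in product([True, False], repeat=len(LIGHTS)):
        yield dict(zip(LIGHTS, values))

def holds_everywhere(claim):
    return all(claim(s) for s in states())

# Internal negation, schema (17): O(P) == INEG(O)(LEXINV(P)).
ineg_some = holds_everywhere(lambda s: some(on, s) == not_all(off, s))
ineg_no = holds_everywhere(lambda s: no(on, s) == all_(off, s))

# Duality, schema (20): not O(P) == DUAL(O)(LEXINV(P)).
dual_some = holds_everywhere(lambda s: (not some(on, s)) == all_(off, s))
dual_all = holds_everywhere(lambda s: (not all_(on, s)) == some(off, s))

assert ineg_some and ineg_no and dual_some and dual_all
```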

The main reason for applying lexical inversion to the predicates in these tests, rather than straightforward grammatical negation by means of the negative particle not, is that the latter may yield scope ambiguities, depending on whether it is taken to express internal or external negation (Löbner 2011, p. 492ff.). For example, the negative particle not in the left-hand side of (23-24) may get the internal negation reading (23) as well as the external negation reading (24). Similarly, the modal auxiliary may in the left-hand side of (25-26) interacts differently with the negative particle not depending on the type of modality involved: in its epistemic use, it gets the internal negation reading (25), whereas in its deontic use, it gets the external negation reading (26).

(23)   \begin{equation*} \textbf{All}~ children~ are~ \textbf{not}~ inside. & \stackrel{1}{\equiv} & \textbf{All}~ children~ are~ \textbf{outside}.\end{equation*}

(24)   \begin{equation*} &\stackrel{2}{\equiv}& \textbf{Not all}~ children~ are~ \textbf{inside}.\end{equation*}

(25)   \begin{equation*}She~ \textbf{may not}~ stay. & \stackrel{1}{\equiv} & She~ \textbf{may}~ leave.\end{equation*}

(26)   \begin{equation*} & \stackrel{2}{\equiv} & She~ \textbf{must}~ leave.\end{equation*}

The negative particle not and the quantifier all in the left-hand side of (23-24) can take scope over each other: in (23), not occurs inside the scope of all (thereby transforming the predicate inside into its lexical inverse outside), while in (24), all occurs inside the scope of not. Such scope ambiguities also arise for other operators besides negation. For example, the quantifier all and the modal adverb necessarily in (27-28) can take scope over each other, thus giving rise to the de dicto reading (27) and the de re reading (28). However, scope distinctions cannot be fully reduced to the de dicto/de re distinction. After all, the latter is a binary distinction, whereas operators that take scope over each other can give rise to more than two distinct interpretations (Kripke 1977).

(27)   \begin{equation*} \textbf{Everything}~ is~ \textbf{necessarily}~ self\textrm{-}identical. & \stackrel{1}{\equiv} & \Box\forall x(x = x),\end{equation*}

(28)   \begin{equation*} & \stackrel{2}{\equiv}& \forall x\Box(x=x).\end{equation*}

Another complication arising from negation concerns the cognitive difficulty that people have with processing sentences that contain multiple negations. Because of these cognitive difficulties, some of the tests described above are less easily applicable to determine whether a certain relation holds between two expressions. For example, we not only have a duality between the positive quantifiers all and some, but also one between the negative quantifiers no and some not. The former duality is empirically confirmed by the dialogue patterns in (21-22). In contrast, the corresponding dialogue patterns for the latter duality in (29-30) contain three grammatical negations (no, no and not) and one lexical inversion (off), and therefore sound much less natural (even though they are logically impeccable).

(29)   \begin{equation*} Are~ \textbf{no}~ lights~ \textbf{on}?- No, &\equiv& \textbf{not all}~ lights~ are~ \textbf{off}.\end{equation*}

(30)   \begin{equation*} Are~ \textbf{not all}~ lights~ \textbf{on}?- No, &\equiv& \textbf{no}~ lights~ are~ \textbf{off}.\end{equation*}

Pronouns and adverbs of quantification. The universal and existential quantifiers are not only related to the determiners all and some, but also to a number of other linguistic categories. For example, when quantifying over people or objects, the determiners are morphologically integrated with the nouns body and thing into indefinite pronouns. Similarly, when quantifying over places, the determiners are morphologically integrated with the adverb where into compound adverbs. By contrast, adverbs that quantify over time and manner exhibit more idiosyncratic lexicalization patterns. Irrespective of such morphological details, all of the categories in the table below inherit the same basic duality pattern from the determiners, and thus, ultimately, from the logical quantifiers \forall and \exists.

\forall          | \neg\forall     | \forall\neg  | \neg\forall\neg
\neg\exists\neg  | \exists\neg     | \neg\exists  | \exists
every            | not every       | no           | some
everybody        | not everybody   | nobody       | somebody
everything       | not everything  | nothing      | something
everywhere       | not everywhere  | nowhere      | somewhere
always           | not always      | never        | sometimes
anyhow           | not anyhow      | no way       | somehow

Generalized quantifiers. Contemporary generalized quantifier theory (GQT) is able to deal with a considerably larger range of natural language quantifiers than the usual universal and existential ones (Barwise and Cooper 1981; Peters and Westerståhl 2006). These include quantifiers that cannot be expressed in first-order languages, such as most. Additionally, GQT allows for a more compositional treatment of quantification. Consider, for example, the sentences John runs and everybody runs, which have by and large the same syntactic structure (namely: noun phrase + verb phrase). While the first-order representations of the semantics of these sentences are vastly different (\textit{run}(j) vs. \forall x\colon \textit{run}(x)), their GQT representations are much more similar: \textit{John}(\textit{run}) vs. \textit{everybody}(\textit{run}).

GQT offers two (mathematically equivalent) perspectives on quantification: a functional and a relational perspective. Focusing on the former, a quantifier expression Q is taken to denote a set of subsets of the universe U of people, and for any unary predicate expression B, the formula Q(B) is true iff [\![B]\!] \in [\![Q]\!]. For example, since

    \[[\![\textit{everybody}]\!] = \{X \subseteq U \mid U = X\}\]

and

    \[[\![\textit{somebody}]\!] = \{X \subseteq U \mid X \neq \emptyset\}\]

it is easy to see that \textit{everybody}(\textit{run}) is true iff U = [\![\textit{run}]\!] and that \textit{somebody}(\textit{run}) is true iff [\![\textit{run}]\!] \neq \emptyset. As expected, the external negation, internal negation and dual of the formula Q(B) are defined as \neg Q(B), Q(\neg B) and \neg Q(\neg B), respectively (with the convention that [\![\neg B]\!] = U - [\![B]\!] = \{x \in U \mid x \notin [\![B]\!]\}). For example, the dual of \textit{everybody}(\textit{run}) is \neg \textit{everybody} (\neg \textit{run}), which is true iff U \neq U - [\![\textit{run}]\!], i.e. iff [\![\textit{run}]\!]\neq \emptyset. This shows that in GQT, too, the dual of \textit{everybody}(\textit{run}) is \textit{somebody}(\textit{run}). Finally, if the proper name John names the individual j \in U, then GQT defines the generalized quantifier

    \[[\![\textit{John}]\!] = \{X \subseteq U \mid j \in X\}\]

and thus we find that \textit{John}(\textit{run}) is true iff [\![\textit{run}]\!] \in [\![\textit{John}]\!], iff j \in [\![\textit{run}]\!]. Note that the dual of \textit{John}(\textit{run}) is \neg \textit{John}(\neg \textit{run}), which is true iff j \notin U - [\![\textit{run}]\!], iff j\in [\![\textit{run}]\!]. This shows that \textit{John}(\textit{run}) is dual to itself, which illustrates the fact that in GQT, proper names are self-dual (Gamut 1991, p. 238).
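The functional perspective lends itself to a direct set-theoretic check. The following sketch assumes a hypothetical three-person universe and represents each quantifier as a set of subsets of U, exactly as in the definitions above; the helper names powerset and dual are introduced here for illustration only.

```python
from itertools import combinations

U = frozenset({"john", "mary", "sue"})  # hypothetical universe of people

def powerset(s):
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

SUBSETS = powerset(U)

# Quantifier denotations as sets of subsets of U.
everybody = {X for X in SUBSETS if X == U}
somebody = {X for X in SUBSETS if X}          # non-empty subsets
john = {X for X in SUBSETS if "john" in X}

def dual(Q):
    # B belongs to the dual of Q iff the complement of B does NOT belong to Q,
    # mirroring the definition of the dual of Q(B) as not-Q(not-B).
    return {X for X in SUBSETS if (U - X) not in Q}

assert dual(everybody) == somebody   # everybody and somebody are duals
assert dual(somebody) == everybody
assert dual(john) == john            # proper names are self-dual
```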

We now turn to the alternative, relational perspective in GQT. This perspective focuses on sentences of the form Q(A,B), where Q is a quantifier expression and A and B are unary predicate expressions. The formula Q(A,B) is true iff ([\![A]\!],[\![B]\!]) \in [\![Q]\!]. Here are some well-known examples (with \wp(U) denoting the powerset of U, i.e. \wp(U) = \{X \mid X \subseteq U\}):

(31)   \begin{equation*}\begin{array}{rcl} [\![\textit{all}]\!] & = & \{(X,Y) \in \wp(U)\times \wp(U) \mid X \subseteq Y\} \\ [\![\textit{some}]\!] & = & \{(X,Y) \in \wp(U)\times \wp(U) \mid X \cap Y\neq \emptyset\} \\ [\![\textit{most}]\!] & = & \{(X,Y) \in \wp(U)\times \wp(U) \mid |X \cap Y| > |X - Y|\} \\ [\![\textit{some but not all}]\!] & = & \{(X,Y) \in \wp(U)\times \wp(U) \mid X \cap Y \neq \emptyset \text{ and } X - Y \neq \emptyset\} \\ [\![\textit{exactly half of the}]\!] & = & \{(X,Y) \in \wp(U)\times \wp(U) \mid |X \cap Y| = \frac{1}{2}|X|\} \\ [\![\textit{the}_{\text{sing}}]\!] & = & \{(X,Y) \in \wp(U)\times \wp(U) \mid |X| = 1 \text{ and } X \subseteq Y\} \end{array}\end{equation*}

The external negation, internal negation and dual of the formula Q(A,B) are defined as \neg Q(A,B), Q(A,\neg B) and \neg Q(A,\neg B), respectively. Note that, in contrast to the examples from logic discussed in Section 1, internal negation is not applied to all predicate expressions, but only to the second one. Here, too, generalized quantifiers can be their own dual or internal negation. For example, the internal negation of some but not all (man, run) is some but not all (man, \neg run), which is true iff

    \[[\![man]\!] \cap (U - [\![run]\!]) \neq \emptyset \text{ and } [\![man]\!] - (U - [\![run]\!]) \neq \emptyset\]

iff

    \[[\![man]\!] - [\![run]\!] \neq \emptyset \text{ and } [\![man]\!] \cap [\![run]\!] \neq \emptyset\]

iff some but not all (man,run) is true. This shows that some but not all is its own internal negation. Similarly, the proportional quantifier exactly half of the can be shown to be its own internal negation; for example, exactly half of the men are awake is equivalent to exactly half of the men are not awake.
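The claims about self-internal-negation can be confirmed exhaustively on a small universe. The sketch below is illustrative: the four-element universe and the Python names are assumptions, while the truth conditions follow the relational definitions in (31), with internal negation applied to the second argument only.

```python
from itertools import combinations

U = frozenset(range(4))  # hypothetical four-element universe

def powerset(s):
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

SUBSETS = powerset(U)
PAIRS = [(X, Y) for X in SUBSETS for Y in SUBSETS]

# Relational quantifier denotations, following (31).
def all_q(X, Y): return X <= Y
def some_q(X, Y): return bool(X & Y)
def most_q(X, Y): return len(X & Y) > len(X - Y)
def some_but_not_all(X, Y): return bool(X & Y) and bool(X - Y)
def exactly_half(X, Y): return 2 * len(X & Y) == len(X)

# Negations act on the second argument only.
def eneg(Q): return lambda X, Y: not Q(X, Y)
def ineg(Q): return lambda X, Y: Q(X, U - Y)
def dual(Q): return lambda X, Y: not Q(X, U - Y)

def same(Q1, Q2):
    # Two quantifiers coincide if they agree on every pair of subsets of U.
    return all(Q1(X, Y) == Q2(X, Y) for X, Y in PAIRS)

assert same(dual(all_q), some_q)                      # all and some are duals
assert same(ineg(some_but_not_all), some_but_not_all) # own internal negation
assert same(ineg(exactly_half), exactly_half)         # own internal negation
assert not same(ineg(most_q), most_q)                 # most is not
```

The last assertion shows that being one's own internal negation is a special property: it fails for most, since most(X, Y) and most(X, U - Y) cannot both hold.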

The duality patterns of quantifiers such as most and many have been a matter of contention. Peterson (1979) proposed an analysis from which it follows that most and many are dual to each other. However, as pointed out by Horn (2006, p. 36), it seems unlikely that most(A, B) is in general equivalent to \neg many(A,\neg B). Consider, for example:

(32)   \begin{equation*} \textit{Most Italians like pizza.}\end{equation*}

(33)   \begin{equation*} \textit{Not many Italians do not like pizza.}\end{equation*}

(34)   \begin{equation*} \textit{Many Italians do not like pizza.}\end{equation*}

If most and many were indeed dual, then (32) and (33) should be equivalent, while (32) and (34) should be contradictory. However, (32) is true, but, since there are indeed many Italians that do not like pizza, (33) is false and (34) is true. This shows that (32) and (33) are not equivalent, and that (32) and (34) are not contradictory either.

Other linguistic expressions. Duality patterns also arise among natural language expressions that do not directly correspond to logical operators or quantifiers. For example, König (1991) has suggested that the causative conjunction because and the concessive conjunction although are duals, based on dialogue tests for duality such as (35).

(35)   \begin{equation*} p~ \textbf{because}~ q? - No, & \equiv & p~ \textbf{although}~ \neg q.\end{equation*}

However, based on other linguistic evidence and more general, methodological considerations, this proposal has been criticized by Iten (1998, 2005). Working in the framework of relevance theory, Iten argues that causative conjunctions make a significant contribution to the truth conditions of sentences in which they occur: p because q is true iff q is true, p is true, and q’s being true is the cause of p’s being true. By contrast, concessive conjunctions do not contribute to the truth conditions of sentences in which they occur: p although q is true iff q is true and p is true. Because of this discrepancy, Iten claims that sentences such as \neg(p because q) and p although \neg q do not have the same truth conditions, and consequently, because and although are not dual to each other.

The most widely studied example of linguistic duality, however, is that between the aspectual adverbs already and still (Löbner 1989, 1990, 1999; van der Auwera 1993; Mittwoch 1993; Michaelis 1996; Smessaert and ter Meulen 2004). The dialogue tests for duality in (36-37) suggest that already and still are indeed each other’s duals.

(36)   \begin{equation*} Is~ Bob~ \textbf{already}~ \textbf{outside}? - No, & \equiv & he~ is~ \textbf{still}~ \textbf{inside}.\end{equation*}

(37)   \begin{equation*} Is~ Bob~ \textbf{still}~ \textbf{outside}? - No, & \equiv & he~ is~ \textbf{already}~ \textbf{inside}.\end{equation*}

Similarly, using the equivalence tests for internal negation in (38-39), we find that the internal negation of already is no longer and that of still is not yet. Finally, the equivalences in (40-41) show that the external negation of already is not yet and that of still is no longer.

(38)   \begin{equation*} Bob~ is~ \textbf{already}~ \textbf{outside}. & \equiv & Bob~ is~ \textbf{no longer}~ \textbf{inside}.\end{equation*}

(39)   \begin{equation*} Bob~ is~ \textbf{still}~ \textbf{outside}. & \equiv &  Bob~ is~ \textbf{not yet}~ \textbf{inside}.\end{equation*}

(40)   \begin{equation*} It's~ not~ the~ case~ that~ Bob~ is~ \textbf{already}~ \textbf{outside}. & \equiv & Bob~ is~ \textbf{not yet}~ \textbf{outside}.\end{equation*}

(41)   \begin{equation*} It's~ not~ the~ case~ that~ Bob~ is~ \textbf{still}~ \textbf{outside}. & \equiv & Bob~ is~ \textbf{no longer}~ \textbf{outside}.\end{equation*}

The two negative adverbs no longer and not yet are also dual to each other, as illustrated by the dialogues in (42-43). However, because of the multiple negative elements, these dialogues sound less natural than the ones in (36-37), even though all of them are equally logically correct (compare with the dialogues in (21-22) and (29-30) for the dualities between the standard quantifiers).

(42)   \begin{equation*} Is~ Bob~ \textbf{not yet}~ \textbf{outside}? - No, & \equiv & he~ is~ \textbf{no longer}~ \textbf{inside}.\end{equation*}

(43)   \begin{equation*} Is~ Bob~ \textbf{no longer}~ \textbf{outside}? - No, & \equiv & he~ is ~\textbf{not yet}~ \textbf{inside}.\end{equation*}

Phase quantification. In order to account for the duality patterns of the aspectual adverbs described in (36-43), Löbner (1989; 1990; 2011) has developed the theory of phase quantification. He considers a (linear) temporal scale, a reference time t on that scale, and a proposition p (which is either true or false at any timepoint of the scale). The semantics of aspectual adverbs crucially concerns single polarity transitions on this temporal scale. There are two types of such transitions: the truth value of p can change from false into true, or alternatively, from true into false. Furthermore, the reference time t can either be situated in the positive (p) phase or in the negative (\neg p) phase of such a transition. In total, there are thus four cases to be distinguished:

    • t is in the positive phase of a polarity transition from falsity to truth

As illustrated in Figure 1(a), this corresponds to sentences such as Bob was already reading the paper at noon. The reference time (at noon) is situated in the positive phase (in which Bob was reading the paper), and thus occurs after the (actual) transition of starting to read (i.e. the transition from not reading to reading) has taken place.

Figure 1: Löbner’s Four Phase Diagrams

    • t is in the positive phase of a polarity transition from truth to falsity

As illustrated in Figure 1(b), this corresponds to sentences such as Bob was still reading the paper at noon. The reference time (at noon) is situated in the positive phase (in which Bob was reading the paper), and thus occurs before the (potential) transition of stopping to read (i.e. the transition from reading to not reading) has taken place.

    • t is in the negative phase of a polarity transition from falsity to truth

As illustrated in Figure 1(c), this corresponds to sentences such as Bob was not yet reading the paper at noon. The reference time (at noon) is situated in the negative phase (in which Bob was not reading the paper), and thus occurs before the (potential) transition of starting to read (i.e. the transition from not reading to reading) has taken place.

    • t is in the negative phase of a polarity transition from truth to falsity

As illustrated in Figure 1(d), this corresponds to sentences such as Bob was no longer reading the paper at noon. The reference time (at noon) is situated in the negative phase (in which Bob was not reading the paper), and thus occurs after the (actual) transition of stopping to read (i.e. the transition from reading to not reading) has taken place.

In the case of duality (already/still and not yet/no longer), the actual polarity of p thus remains unchanged, but the direction of the polarity transition gets reversed. By contrast, in the case of external negation (not yet/already and still/no longer) the actual polarity of p is switched, but the polarity transition remains unchanged. Finally, in the case of internal negation (not yet/still and already/no longer), both the actual polarity of p and the direction of the polarity transition are reversed. This shows that in the phase quantification analysis, internal negation is viewed as the combination of duality and external negation. Löbner has also used this analysis to account for asymmetries in lexicalization patterns: already and still are less marked than not yet, which in turn is less marked than no longer (also see Section 5). Finally, it should also be emphasized that this analysis has been generalized to other lexical domains besides the aspectual adverbs, such as scalar predicates and (the procedural interpretation of) the first-order quantifiers.
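The four cases and the three relations between them can be summarized in a small lookup table. The encoding below, a pair consisting of the polarity of p at the reference time and the direction of the transition, is a simplification introduced here for illustration; it is not Löbner's own formalization.

```python
# Each adverb is encoded as (polarity of p at t, direction of the transition).
ADVERBS = {
    "already":   (True,  "false->true"),
    "still":     (True,  "true->false"),
    "not yet":   (False, "false->true"),
    "no longer": (False, "true->false"),
}
LOOKUP = {code: adverb for adverb, code in ADVERBS.items()}

def flip(direction):
    return "true->false" if direction == "false->true" else "false->true"

def dual(adverb):
    # Duality: polarity unchanged, direction of the transition reversed.
    polarity, direction = ADVERBS[adverb]
    return LOOKUP[(polarity, flip(direction))]

def eneg(adverb):
    # External negation: polarity switched, transition unchanged.
    polarity, direction = ADVERBS[adverb]
    return LOOKUP[(not polarity, direction)]

def ineg(adverb):
    # Internal negation: both polarity and direction reversed.
    polarity, direction = ADVERBS[adverb]
    return LOOKUP[(not polarity, flip(direction))]

assert dual("already") == "still" and dual("not yet") == "no longer"
assert eneg("already") == "not yet" and eneg("still") == "no longer"
assert ineg("already") == "no longer" and ineg("still") == "not yet"

# Internal negation is the combination of duality and external negation.
assert all(ineg(a) == dual(eneg(a)) == eneg(dual(a)) for a in ADVERBS)
```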

Language universals and universal languages. The overview presented in this section shows that duality phenomena are not only ubiquitous in formal logical languages, but also in natural languages. It has therefore been suggested that duality is a semantic universal, which can be of great heuristic value in comparative linguistic research (van Benthem 1991). Furthermore, duality also plays a central role in artificial languages, which can be viewed as occupying an intermediate position between formal and natural languages. For example, Lincos, which was developed by Freudenthal (1960) for the purpose of cosmic communication, contains duality principles for conjunction/disjunction (1.36.8), universal/existential quantification (1.36.9), necessity/possibility (3.25.1) and obligation/permission (3.32.3).

3. Theoretical Framework

General definition. We will now present a general theoretical framework in which duality phenomena can be described and analyzed. Consider Boolean algebras

    \[\mathbb{A} = \langle A, \wedge_\mathbb{A}, \vee_\mathbb{A}, \neg_\mathbb{A}, \top_\mathbb{A}, \bot_\mathbb{A}\rangle\]

and

    \[\mathbb{B} = \langle B, \wedge_\mathbb{B}, \vee_\mathbb{B}, \neg_\mathbb{B}, \top_\mathbb{B}, \bot_\mathbb{B}\rangle\]

(Givant and Halmos 2009), and consider n-ary operators O_1, O_2\colon\mathbb{A}^n \to \mathbb{B}. The duality relations are defined as follows: O_1 and O_2 are

  • identical – abbreviated as \Tiny{ID}\small{(O_1, O_2)} – iff
    \forall a_1,\dots,a_n \!\in\! A\!: O_1(a_1,\dots,a_n) = O_2(a_1,\dots,a_n),

  • each other’s external negation – abbreviated as \Tiny{ENEG}\small{(O_1, O_2)} – iff
    \forall a_1,\dots,a_n \!\in\! A\!: O_1(a_1,\dots,a_n) = \neg_\mathbb{B}O_2(a_1,\dots,a_n),

  • each other’s internal negation – abbreviated as \Tiny{INEG}\small{(O_1, O_2)} – iff
    \forall a_1,\dots,a_n \!\in\! A\!: O_1(a_1,\dots,a_n) = O_2(\neg_\mathbb{A}a_1,\dots,\neg_\mathbb{A}a_n),

  • each other’s dual – abbreviated as \Tiny{DUAL}\small{(O_1, O_2)} – iff
    \forall a_1,\dots,a_n \!\in\! A\!: O_1(a_1,\dots,a_n) = \neg_\mathbb{B}O_2(\neg_\mathbb{A}a_1,\dots,\neg_\mathbb{A}a_n).
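These four definitions translate directly into higher-order functions. The sketch below makes one simplifying assumption: both Boolean algebras are taken to be the two-element algebra of truth values, so that operators are ordinary Python functions on booleans. The names identical, conj, disj and majority are illustrative choices.

```python
from itertools import product

def eneg(op):
    # External negation: negate the output.
    return lambda *args: not op(*args)

def ineg(op):
    # Internal negation: negate every input.
    return lambda *args: op(*(not a for a in args))

def dual(op):
    # Dual: negate every input and the output.
    return lambda *args: not op(*(not a for a in args))

def identical(op1, op2, n):
    # ID(O_1, O_2): the operators agree on all n-tuples of arguments.
    return all(op1(*args) == op2(*args)
               for args in product([False, True], repeat=n))

def conj(a, b):
    return a and b

def disj(a, b):
    return a or b

def majority(a, b, c):
    # A ternary operator: true iff at least two arguments are true.
    return (a and b) or (a and c) or (b and c)

assert identical(conj, dual(disj), 2)        # DUAL(conj, disj)
assert identical(disj, dual(conj), 2)
assert identical(ineg(conj), eneg(disj), 2)  # not-a and not-b == not (a or b)
assert identical(majority, dual(majority), 3)  # majority is its own dual
```

The ternary majority operator illustrates that the definitions are not restricted to unary or binary operators, and that an operator can be its own dual.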

Special cases. The definition provided above is fully abstract and general, but by plugging in concrete Boolean algebras for \mathbb{A} and \mathbb{B}, we can recover the usual dualities as special cases. For example, in the language \mathcal{L}_\mathsf{CPL} of classical propositional logic (\mathsf{CPL}), we can define equivalence classes

    \[[\varphi] := \{\psi \in\mathcal{L}_\mathsf{CPL} \mid \varphi \equiv \psi\}\]

and consider the Lindenbaum-Tarski algebra

    \[\mathbb{B}_\mathsf{CPL} := \{[\varphi] \mid \varphi \in \mathcal{L}_\mathsf{CPL}\}\]

It is well-known that \mathbb{B}_\mathsf{CPL} is a Boolean algebra, and can thus be plugged in for \mathbb{A} and/or \mathbb{B} in the aforementioned definition. For example, if we consider conjunction and disjunction as binary operators

    \[\wedge,\vee\colon\mathbb{B}_\mathsf{CPL}\times\mathbb{B}_\mathsf{CPL} \to \mathbb{B}_\mathsf{CPL}\]

(defined by [\varphi]\wedge[\psi]:=[\varphi\wedge\psi] and [\varphi]\vee[\psi]:=[\varphi\vee\psi]), this definition states that \Tiny{DUAL}\small{(\wedge,\vee)} iff

for all  [\varphi], [\psi] \in \mathbb{B}_\mathsf{CPL}: [\varphi] \wedge [\psi] = \neg(\neg [\varphi] \vee \neg[\psi]),

which is equivalent to the formulation (2) that was given above

for all \varphi, \psi \in \mathcal{L}_\mathsf{CPL}: \varphi \wedge \psi \equiv \neg(\neg \varphi \vee \neg\psi).

(Note that identity between elements in the Lindenbaum-Tarski algebra boils down to logical equivalence between the formulas themselves.) Similarly, the first-order quantifiers can be seen as unary operators

    \[\forall,\exists\colon\mathbb{B}_\mathsf{FOL}\to\mathbb{B}_\mathsf{FOL}\]

where \mathbb{B}_\mathsf{FOL} is the Lindenbaum-Tarski algebra of first-order logic (\mathsf{FOL}), which is a cylindric algebra (Henkin et al. 1971), and thus a fortiori a Boolean algebra. Finally, by taking \mathbb{A} and/or \mathbb{B} to be other, more exotic Boolean algebras, the aforementioned definition also allows us to study duality relations in other, less well-known applications (Demey and Smessaert 2016).

Relations vs. functions. All the duality relations have a number of special properties. For any relation R \in \{\Tiny{ID}\small{,}\Tiny{INEG}\small{,}\Tiny{ENEG}\small{,}\Tiny{DUAL}\small{\}}, one can show that

  • R is deterministic:
    for all O_1,O_2,O_3\colon\mathbb{A}^n\to\mathbb{B}: if R(O_1,O_2) and R(O_1,O_3), then O_2 = O_3,
  • R is serial:
    for all O_1\colon\mathbb{A}^n\to\mathbb{B}, there exists an O_2\colon\mathbb{A}^n\to\mathbb{B} such that R(O_1,O_2),
  • R is symmetric:
    for all O_1,O_2\colon\mathbb{A}^n\to\mathbb{B}: R(O_1,O_2) iff R(O_2,O_1).

The first two properties jointly state that for each O_1, there is exactly one O_2 such that R(O_1,O_2). This means that the relation R is essentially a function, and switching from relational to functional notation, we can thus write O_2 = R(O_1).

For example, since \Tiny{DUAL}\small{(\wedge,\vee)}, we can write \vee = \Tiny{DUAL}\small{(\wedge)}, and say that \vee is the (unique) dual of \wedge. However, since \wedge and \vee are seen as binary operators on the Lindenbaum-Tarski algebra \mathbb{B}_{\mathsf{CPL}}, it should be kept in mind that this uniqueness claim ultimately boils down to a logical equivalence claim (see above). For example, consider the operator

    \[O\colon\mathbb{B}_{\mathsf{CPL}}\times\mathbb{B}_{\mathsf{CPL}}\to\mathbb{B}_{\mathsf{CPL}}\]

defined by

    \[O([\varphi],[\psi]) := \neg(\neg[\varphi] \wedge \neg[\psi])\]

It then holds that \Tiny{DUAL}\small{(\wedge,\vee)} and \Tiny{DUAL}\small{(\wedge,O)}, which together entail that \vee = O. The latter is an identity of functions, and thus means that for all [\varphi],[\psi]\in\mathbb{B}_{\mathsf{CPL}}, we have

    \[[\varphi] \vee [\psi] = O([\varphi],[\psi]) = \neg(\neg[\varphi] \wedge \neg[\psi])\]

in other words: for all

    \[\varphi,\psi\in\mathcal{L}_\mathsf{CPL}\]

it holds that

    \[\varphi \vee \psi \equiv \neg(\neg\varphi \wedge \neg\psi)\]

Since each R \in \{\Tiny{ID}\small{,}\Tiny{INEG}\small{,}\Tiny{ENEG}\small{,}\Tiny{DUAL}\small{\}} can be viewed as a function, the symmetry of the relation R can equivalently be expressed as follows: O_2 = R(O_1) iff O_1 = R(O_2), which is itself equivalent to the property that R(R(O)) = O for all operators O\colon\mathbb{A}^n\to\mathbb{B}. This means that the function R is an involution.
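The involution property can be verified exhaustively in the simplest setting. The following sketch assumes the two-element Boolean algebra and represents each of the 16 binary operators by its truth table; this representation and the names used are illustrative assumptions.

```python
from itertools import product

# Argument pairs in a fixed order: (F,F), (F,T), (T,F), (T,T).
INPUTS = list(product([False, True], repeat=2))
# Each binary operator is a truth table: one output per argument pair.
OPERATORS = list(product([False, True], repeat=4))

def ident(table):
    return table

def eneg(table):
    # Negate every output.
    return tuple(not value for value in table)

def ineg(table):
    # The output at (a, b) is the original output at (not a, not b).
    return tuple(table[INPUTS.index((not a, not b))] for a, b in INPUTS)

def dual(table):
    return eneg(ineg(table))

def is_involution(f):
    return all(f(f(t)) == t for t in OPERATORS)

def is_symmetric(f):
    # R(O_1, O_2) iff R(O_2, O_1), with R(O_1, O_2) read as f(O_1) == O_2.
    return all((f(s) == t) == (f(t) == s)
               for s in OPERATORS for t in OPERATORS)

AND = (False, False, False, True)
OR = (False, True, True, True)

assert dual(AND) == OR and dual(OR) == AND
assert all(is_involution(f) for f in (ident, eneg, ineg, dual))
assert all(is_symmetric(f) for f in (ident, eneg, ineg, dual))
```

Since each relation is implemented as a Python function, determinism and seriality hold by construction; only symmetry and the involution property need to be checked.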

Obviously, the definitions of the duality relations/functions can harmlessly be transposed from operators O\colon\mathbb{A}^n\to\mathbb{B} to the outputs of those operators. For example, if the operator O_2\colon\mathbb{A}^n\to\mathbb{B} is the dual of the operator O_1\colon\mathbb{A}^n\to\mathbb{B}, then for all a_1,\dots,a_n\in\mathbb{A}, the element O_2(a_1,\dots,a_n) \in\mathbb{B} can be said to be the dual of the element O_1(a_1,\dots,a_n) \in\mathbb{B}. For example, in this way, we can say not only that \vee is the dual of \wedge, but also that [\varphi]\vee[\psi] is the dual of [\varphi]\wedge[\psi], for all [\varphi],[\psi]\in\mathbb{B}_{\mathsf{CPL}} – or more informally, that \varphi\vee\psi is ‘the’ dual (up to logical equivalence) of \varphi\wedge\psi, for all \varphi,\psi\in\mathcal{L}_\mathsf{CPL}.

Duality squares. For every operator O\colon\mathbb{A}^n\to\mathbb{B}, one can define the set of four operators

    \[\delta(O) := \{\Tiny{ID}\small{(O)}, \Tiny{ENEG}\small{(O)},\Tiny{INEG}\small{(O)},\Tiny{DUAL}\small{(O)}\}\]

It is natural to view the set \delta(O) as ‘generated’ by the operator O; however, it should be emphasized that \delta(O) can be seen as generated by any of its elements. For example, if we consider \Tiny{DUAL}\small{(O)}, we find that

    \[\begin{array}{rcl} \delta(\Tiny{DUAL}\small{(O))} & = & \{\Tiny{ID}\small{(}\Tiny{DUAL}\small{(O))}, \Tiny{ENEG}\small{(}\Tiny{DUAL}\small{(O))}, \Tiny{INEG}\small{(}\Tiny{DUAL}\small{(O))}, \Tiny{DUAL}\small{(}\Tiny{DUAL}\small{(O))}\} \\ & = & \{\Tiny{DUAL}\small{(O)}, \Tiny{INEG}\small{(O)}, \Tiny{ENEG}\small{(O)}, \Tiny{ID}\small{(O)}\} \\ & = & \delta(O). \end{array}\]

In general, for any O' \in \delta(O), it holds that \delta(O') = \delta(O) (Peters and Westerståhl 2006, p. 134; Westerståhl 2012, p. 205).
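The fact that \delta(O) is generated by any of its elements can likewise be checked for all binary truth-functional operators. The sketch assumes the two-element Boolean algebra and a truth-table representation of operators; the name delta mirrors the notation above, and everything else is an illustrative choice.

```python
from itertools import product

INPUTS = list(product([False, True], repeat=2))     # (F,F), (F,T), (T,F), (T,T)
OPERATORS = list(product([False, True], repeat=4))  # all 16 truth tables

def eneg(table):
    return tuple(not value for value in table)

def ineg(table):
    # The output at (a, b) is the original output at (not a, not b).
    return tuple(table[INPUTS.index((not a, not b))] for a, b in INPUTS)

def dual(table):
    return eneg(ineg(table))

def delta(table):
    # The set generated by an operator: ID, ENEG, INEG and DUAL of it.
    return frozenset({table, eneg(table), ineg(table), dual(table)})

# Every element of delta(O) generates the same set.
closed = all(delta(other) == delta(op)
             for op in OPERATORS for other in delta(op))
assert closed

AND = (False, False, False, True)
XOR = (False, True, True, False)
sizes = {len(delta(op)) for op in OPERATORS}
```

In this finite setting the four operators need not be pairwise distinct: \delta(AND) has four elements, whereas exclusive disjunction is its own internal negation, so \delta(XOR) collapses to two.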

The argument above is based on the fact that \delta(O) is ‘closed under duality’, in the sense that applying any of the \Tiny{ID}-, \Tiny{ENEG}-, \Tiny{INEG}- or \Tiny{DUAL}-functions to its elements only yields operators that already belong to \delta(O). This observation is the starting point for the group-theoretical perspective on duality that will be developed in Section 4. The operators in \delta(O) thus constitute natural families (van Benthem 1991, p. 31; Peters and Westerståhl 2006, p. 26), which are often visualized by means of square diagrams. The diagram’s vertices represent the four operators (or formulas), and its edges and diagonals represent the various relations between those operators. Figure 2(a) shows the graphical convention that will be used in this article to visualize these relations.

Visually speaking, duality squares can be presented in a number of different ways, depending on which aspects the author wishes to emphasize. The most widely used presentation can be found in Figure 2(b), in which the \Tiny{ENEG}-, \Tiny{INEG}- and \Tiny{DUAL}-relations occupy the square’s diagonals, horizontal and vertical edges, respectively. This presentation thus emphasizes the analogy between the duality square and the well-known Aristotelian square, in which the contradiction, (sub)contrariety and subalternation relations also occupy the diagonals, horizontal and vertical edges, respectively (van Benthem 1991, p. 31; Jaspers 2005, p. 148; Peters and Westerståhl 2006, p. 25; Westerståhl 2012, p. 202); also see Section 5. Figure 2(c) shows an alternative layout, in which the \Tiny{DUAL}-relations occupy the diagonals, thereby graphically reflecting the fact that \Tiny{DUAL} is the combination of \Tiny{ENEG} (which constitutes the vertical edges) and \Tiny{INEG} (which constitutes the horizontal edges) (Löbner 1990, p. 69ff.; König 1991, p. 201); also see Section 4. Thirdly, Löbner (1999, p. 57; 2011, p. 488) has argued, on the basis of his phase quantification approach to duality (see Section 2), that \Tiny{INEG} should be seen as the combination of \Tiny{ENEG} and \Tiny{DUAL}, and thus uses squares as in Figure 2(d), in which the former occupies the diagonals. Finally, it should be emphasized that the \Tiny{ID}-relations are not visualized explicitly in any of these three ways of presenting duality squares, since they would simply constitute loops on all vertices of the squares.

Figures 3 and 4 show duality squares for some concrete dualities from logic and language (all these squares follow the presentation of Figure 2(b), and thus have \Tiny{ENEG}-diagonals). The first three squares in Figure 3 correspond to the first three examples of duality in logic that were discussed in Section 1: (a) the propositional connectives of conjunction and disjunction, (b) the universal and existential quantifiers, and (c) the modal operators of necessity and possibility. Furthermore, it should be emphasized that the general perspective on duality in terms of external and internal negation also allows us to draw less standardized duality squares; for example, Figure 3(d) shows the less widely known duality square that is generated by the propositional connective of material implication (\to). Finally, the squares in Figure 4 correspond to two examples of duality in natural language that were discussed in Section 2, namely (a) the quantification adverbs everywhere/somewhere, and (b) the aspectual adverbs already/still.

 

Figure 2: (a) Graphical representations of the duality relations; presentations of duality squares with (b) ENEG-diagonals, (c) DUAL-diagonals and (d) INEG-diagonals.

Figure 3: Duality squares from logic: (a) conjunction-disjunction, (b) universal-existential, (c) necessity-possibility, (d) implication.

Figure 4: Duality squares from linguistics: (a) everywhere-somewhere, (b) already-still.

Degenerate duality patterns. For some operators O\colon\mathbb{A}^n\to\mathbb{B}, it might happen that \Tiny{DUAL}\small{(O)} = O = \Tiny{ID}\small{(O)}, i.e. O is self-dual. In this case, one can also show that \Tiny{INEG}\small{(O)} = \Tiny{ENEG}\small{(O)}, i.e. O’s internal and external negation coincide. For example, as was already shown in Section 2, proper names are self-dual in generalized quantifier theory. For another example, consider the identity operator I_\mathbb{A}\colon\mathbb{A}\to\mathbb{A} (for any Boolean algebra \mathbb{A}), which is defined by I_\mathbb{A}(a) := a. For any element a \in A, it holds that

    \[\Tiny{DUAL}\small{(I_\mathbb{A})(a)} = \neg_\mathbb{A} I_\mathbb{A}(\neg_\mathbb{A} a) = \neg_\mathbb{A}\neg_\mathbb{A} a = a = I_\mathbb{A}(a)\]

and thus \Tiny{DUAL}\small{(I_\mathbb{A})} = I_\mathbb{A}, i.e. I_\mathbb{A} is self-dual. Similarly, for any element a\in A it holds that

    \[\Tiny{INEG}\small{(I_\mathbb{A})(a)} = I_\mathbb{A}(\neg_\mathbb{A} a) = \neg_\mathbb{A} a = \neg_\mathbb{A} I_\mathbb{A}(a) = \Tiny{ENEG}\small{(I_\mathbb{A})(a)}\]

and thus \Tiny{INEG}\small{(I_\mathbb{A})} = \Tiny{ENEG}\small{(I_\mathbb{A})}.

Completely analogously, for some operators O\colon\mathbb{A}^n\to \mathbb{B}, it can happen that \Tiny{INEG}\small{(O)} = O = \Tiny{ID}\small{(O)}, i.e. O is its own internal negation. In this case, one can also show that \Tiny{DUAL}\small{(O)} = \Tiny{ENEG}\small{(O)}, i.e. O’s external negation and dual coincide. Consider, for example, the contingency operator C\colon\mathbb{B}_\mathsf{S5}\to\mathbb{B}_\mathsf{S5}, which is defined by

    \[C([\varphi]) := \Diamond[\varphi]\wedge\Diamond\neg[\varphi] = [\Diamond\varphi\wedge\Diamond\neg\varphi]\]

(recall that \mathbb{B}_\mathsf{S5} is the Lindenbaum-Tarski algebra of the modal logic \mathsf{S5}, which is a modal algebra (Blackburn et al. 2001), and thus a fortiori a Boolean algebra). For any [\varphi]\in\mathbb{B}_\mathsf{S5}, it holds that

    \[\Tiny{INEG}\small{(C)([\varphi])} = C(\neg[\varphi]) = \Diamond\neg[\varphi]\wedge\Diamond\neg\neg[\varphi] = \Diamond[\varphi] \wedge\Diamond\neg[\varphi]= C([\varphi])\]

and thus \Tiny{INEG}\small{(C)} = C. Similarly, it holds that

    \[\Tiny{DUAL}\small{(C)([\varphi])} = \neg C(\neg[\varphi]) = \neg(\Diamond\neg[\varphi]\wedge\Diamond\neg\neg[\varphi]) = \neg(\Diamond[\varphi]\wedge\Diamond\neg[\varphi]) = \neg C([\varphi]) =\]

    \[\Tiny{ENEG}\small{(C)([\varphi])}\]

and thus \Tiny{DUAL}\small{(C)} = \Tiny{ENEG}\small{(C)}.

We have now discussed the possibility of an operator coinciding with its dual, or with its internal negation. This naturally leads to the question whether there are also operators that coincide with their external negation. It is easy to see, however, that there exist no non-trivial operators with this property. After all, if O\colon\mathbb{A}^n\to\mathbb{B} is its own external negation, then for all n-tuples \overline{a} \in A^n, it holds that

    \[O(\overline{a}) = \neg_\mathbb{B} O(\overline{a})\]

and hence,

    \[\top_\mathbb{B} = O(\overline{a}) \vee_\mathbb{B}\neg_\mathbb{B}O(\overline{a})=O(\overline{a}) \vee_\mathbb{B}O(\overline{a})=O(\overline{a})\]

and also

    \[\bot_\mathbb{B} = O(\overline{a}) \wedge_\mathbb{B}\neg_\mathbb{B}O(\overline{a})=O(\overline{a}) \wedge_\mathbb{B}O(\overline{a})=O(\overline{a})\]

which means that \mathbb{B} is the trivial Boolean algebra in which \bot_\mathbb{B}= \top_\mathbb{B} (in logical terms: \mathbb{B} is the Lindenbaum-Tarski algebra of a logical system that is inconsistent).

Whenever an operator O is its own dual or internal negation, the set \delta(O) does not contain four, but only two distinct operators (Peters and Westerståhl 2006, p. 134; Westerståhl 2012, p. 205), and thus cannot be visualized using an ordinary duality square. Recall the standard presentation of the duality square (with horizontal \Tiny{INEG}- and vertical \Tiny{DUAL}-edges) in Figure 2(b), which is repeated here as Figure 5(a). If O = \Tiny{DUAL}\small{(O)}, then \delta(O) = \{\Tiny{ID}\small{(O)},\Tiny{INEG}\small{(O)}\}, and thus, the duality square in Figure 5(a) degenerates into the binary horizontal duality diagram in Figure 5(b). Analogously, if O = \Tiny{INEG}\small{(O)}, then \delta(O) = \{\Tiny{ID}\small{(O)},\Tiny{DUAL}\small{(O)}\}, and thus, the duality square in Figure 5(a) degenerates into the binary vertical duality diagram in Figure 5(c).
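The collapse from four operators to two can be illustrated with truth tables. The sketch below assumes two-valued Boolean operators. It uses the identity operator as the self-dual case and, as a stand-in chosen purely for illustration (it is not the contingency example discussed above), exclusive disjunction as an operator that is its own internal negation. The function names are hypothetical.

```python
from itertools import product


def table(op, n):
    """Truth table of an n-ary Boolean operator."""
    return tuple(op(*args) for args in product([False, True], repeat=n))


def delta_tables(op, n):
    """Truth tables of ID(O), ENEG(O), INEG(O) and DUAL(O), keyed by name."""
    variants = {
        "ID": op,
        "ENEG": lambda *a: not op(*a),
        "INEG": lambda *a: op(*(not x for x in a)),
        "DUAL": lambda *a: not op(*(not x for x in a)),
    }
    return {name: table(f, n) for name, f in variants.items()}


def identity(p):
    return p


def xor(p, q):
    # illustrative assumption: exclusive disjunction as an INEG-fixed operator
    return p != q


def conj(p, q):
    return p and q


if __name__ == "__main__":
    for name, op, n in (("identity", identity, 1), ("xor", xor, 2), ("conj", conj, 2)):
        distinct = set(delta_tables(op, n).values())
        print(name, "has", len(distinct), "distinct operators in delta(O)")
```

For the identity operator the ID and DUAL tables coincide, for exclusive disjunction the ID and INEG tables coincide, and conjunction keeps all four tables distinct.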

Figure 5: (a) Ordinary duality square, (b) degenerate duality pattern for an operator that is its own dual, (c) degenerate duality pattern for an operator that is its own internal negation.

Beyond external and internal negation. In the introduction, it was emphasized that this article mainly focuses on duality phenomena that arise in logical and natural languages. As was illustrated in Sections 1 and 2, these dualities can informally be characterized in terms of internal and external negation. In this section, this informal characterization was made mathematically precise, by appealing to operators O\colon\mathbb{A}^n\to\mathbb{B} and viewing the internal and external negation as the negations \neg_\mathbb{A} and \neg_\mathbb{B} of the source and target Boolean algebras \mathbb{A} and \mathbb{B}, respectively. However, it should be emphasized that in the broader mathematical perspective on duality (Gowers 2008; Kabakov et al. 2014), internal/external negation plays a less central role. For example, in category-theoretic terms, conjunction and disjunction are characterized as follows (Mac Lane 1998; Davey and Priestley 2002):

\varphi\wedge\psi is the unique formula \pi such that:
– \pi entails \varphi
– \pi entails \psi
– for all \alpha: if \alpha entails \varphi and \psi, then \alpha entails \pi

\varphi\vee\psi is the unique formula \pi such that:
– \varphi entails \pi
– \psi entails \pi
– for all \alpha: if \varphi and \psi entail \alpha, then \pi entails \alpha

From this perspective, the duality of conjunction and disjunction is thus not characterized in terms of internal and external negation, but rather in terms of systematically ‘reversing’ the direction of entailment (a similar connection between duality and ‘reversing’ the direction of polarity transitions shows up in Löbner’s phase quantification theory, as discussed in Section 2). This difference should not be exaggerated, however, as can already be seen from the law of contraposition, in which the ideas of negation and reversal are brought together: \varphi\to\psi \equiv \neg\psi\to\neg\varphi.

4. A Group-Theoretical Approach to Duality

The Klein four group. When \Tiny{ID}, \Tiny{ENEG}, \Tiny{INEG} and \Tiny{DUAL} are viewed as functions, they map each operator O\colon\mathbb{A}^n\to\mathbb{B} onto the operators

    \[\Tiny{ID}\small{(O)},\Tiny{ENEG}\small{(O),}\]

    \[\Tiny{INEG}\small{(O)},\Tiny{DUAL}\small{(O)}\colon\mathbb{A}^n\to\mathbb{B}\]

Since the input and output of the functions \Tiny{ID}, \Tiny{ENEG}, \Tiny{INEG} and \Tiny{DUAL} are of the same type (namely: operators \mathbb{A}^n\to\mathbb{B}), they can be applied repeatedly. For example, starting with an operator O\colon\mathbb{A}^n\to\mathbb{B}, we can apply \Tiny{INEG} to it to obtain the operator \Tiny{INEG}\small{(O)}\colon\mathbb{A}^n\to\mathbb{B}; by applying \Tiny{ENEG} to the latter we obtain the operator \Tiny{ENEG}\small{(}\Tiny{INEG}\small{(O))}\colon\mathbb{A}^n\to\mathbb{B}. It follows immediately from the definitions of the duality relations/functions that \Tiny{ENEG}\small{(}\Tiny{INEG}\small{(O))} = \Tiny{DUAL}\small{(O)}. Since this holds independently of the concrete operator O, we can write \Tiny{ENEG} \small{\circ} \Tiny{INEG} \small{=} \Tiny{DUAL}, which means that applying \Tiny{INEG} and then \Tiny{ENEG} (to some operator) yields the same result as applying \Tiny{DUAL} (to that same operator). In a similar vein, since for all operators O\colon\mathbb{A}^n\to\mathbb{B} it holds that \Tiny{INEG}\small{(}\Tiny{INEG}\small{(O))} = O = \Tiny{ID}\small{(O)}, we can write \Tiny{INEG} \circ \Tiny{INEG} = \Tiny{ID}. In this way, we obtain a large number of functional identities that describe the behavior of the duality and internal/external negation functions:

(44)   \begin{alignat*}{3} \small{ID} & \circ \small{ID} & = & \; \ \small{ID} & = & \ \small{DUAL} & \circ \small{DUAL} \notag\\ \small{ENEG} & \circ \small{ENEG} & = & \; \ \small{ID} & = & \ \small{INEG} & \circ \small{INEG} \notag\\ \small{INEG} & \circ \small{ENEG} & = & \ \small{DUAL} & = & \ \small{ENEG} & \circ \small{INEG} \notag\\ \small{INEG} & \circ \small{DUAL} & = & \ \small{ENEG} & = & \ \small{DUAL} & \circ \small{INEG} \notag\\ \small{DUAL} & \circ \small{ENEG} & = & \ \small{INEG} & = & \ \small{ENEG} & \circ \small{DUAL} \notag\\ \end{alignat*}

These identities can be summarized by stating that the functions \Tiny{ID}, \Tiny{ENEG}, \Tiny{INEG} and \Tiny{DUAL} jointly form a group that is isomorphic to the Klein four group V_4 (German: Kleinsche Vierergruppe). Its Cayley table looks as follows:

\begin{array}{ c|c c c c } \circ & \small{ID} & \small{ENEG} & \small{INEG} & \small{DUAL} \\ \hline \small{ID} & \small{ID} & \small{ENEG} & \small{INEG} & \small{DUAL} \\ \small{ENEG} & \small{ENEG} & \small{ID} & \small{DUAL} & \small{INEG} \\ \small{INEG} & \small{INEG} & \small{DUAL} & \small{ID} & \small{ENEG} \\ \small{DUAL} & \small{DUAL} & \small{INEG} & \small{ENEG} & \small{ID} \end{array}
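The entries of this Cayley table can be recomputed by brute force. The sketch below is restricted, as a simplifying assumption, to binary operators on truth values, each represented by its truth table, and checks every composition against all sixteen such operators. The names `FUNCS`, `compose_name` and `CAYLEY` are illustrative.

```python
from itertools import product

ARGS = list(product([False, True], repeat=2))  # all argument pairs (p, q)


def eneg(t):
    # negate every output of the truth table
    return tuple(not v for v in t)


def ineg(t):
    # the value at (p, q) becomes the original value at (not p, not q)
    lookup = dict(zip(ARGS, t))
    return tuple(lookup[(not p, not q)] for p, q in ARGS)


def dual(t):
    # defined directly, not as a composition, so the check below is not circular
    lookup = dict(zip(ARGS, t))
    return tuple(not lookup[(not p, not q)] for p, q in ARGS)


FUNCS = {"ID": lambda t: t, "ENEG": eneg, "INEG": ineg, "DUAL": dual}

# truth tables of all 16 binary Boolean operators
ALL_TABLES = list(product([False, True], repeat=len(ARGS)))


def compose_name(f, g):
    """Name of the function that equals 'f after g' on every binary operator."""
    for name, h in FUNCS.items():
        if all(FUNCS[f](FUNCS[g](t)) == h(t) for t in ALL_TABLES):
            return name
    raise ValueError("composition is not one of the four functions")


CAYLEY = {(f, g): compose_name(f, g) for f in FUNCS for g in FUNCS}

if __name__ == "__main__":
    for f in FUNCS:
        print(f.ljust(5), " ".join(CAYLEY[(f, g)].ljust(5) for g in FUNCS))
```

Every element composed with itself yields ID and the table is symmetric, which is what distinguishes the Klein four group from the cyclic group of order four.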

 
The fact that duality behavior can be described by means of V_4 was already noted by authors such as Piaget (1949), Gottschalk (1953), Löbner (1990), van Benthem (1991) and Peters and Westerståhl (2006). However, many of them used slightly differing labels for the group elements; here is an overview table:

Piaget Gottschalk Löbner Peters & Westerståhl
\small{ID} identité (\small{I}) identity (\small{E}) identity
\small{ENEG} inversion (\small{N}) negational (\small{N}) negation outer negation
\small{INEG} réciprocation (\small{R}) contradual (\small{C}) subnegation inner negation
\small{DUAL} corrélation (\small{C}) dual (\small{E}) dual dual

This group-theoretical perspective also allows us to describe the degenerate cases of operators that are their own duals or their own internal negations. Note that these cases are characterized by the identities \Tiny{DUAL} = \Tiny{ID} and \Tiny{INEG} = \Tiny{ID}, respectively. Note that if \Tiny{DUAL} = \Tiny{ID}, then also \Tiny{ENEG} = \Tiny{INEG}, and thus V_4 collapses into a group that is isomorphic to \mathbb{Z}_2; see the left and middle Cayley tables below and also recall Figure 5(b). Similarly, if \Tiny{INEG} = \Tiny{ID}, then also \Tiny{ENEG} = \Tiny{DUAL}, and thus V_4 again collapses into a group that is isomorphic to \mathbb{Z}_2; see the right and middle Cayley tables below and also recall Figure 5(c).

\begin{array}{c|c c} \circ & \small{ID} & \small{INEG} \\ \hline \small{ID} & \small{ID} & \small{INEG} \\ \small{INEG} & \small{INEG} & \small{ID} \end{array} \qquad \begin{array}{c|c c} \circ & \small{0} & \small{1} \\ \hline \small{0} & \small{0} & \small{1} \\ \small{1} & \small{1} & \small{0} \end{array} \qquad \begin{array}{c|c c} \circ & \small{ID} & \small{DUAL} \\ \hline \small{ID} & \small{ID} & \small{DUAL} \\ \small{DUAL} & \small{DUAL} & \small{ID} \end{array}

Finally, it should be noted that the Klein four group V_4 is isomorphic to the direct product of \mathbb{Z}_2 with itself, i.e. V_4 \cong \mathbb{Z}_2 \times \mathbb{Z}_2 = \mathbb{Z}_2^2. Although this fact is well-known in group theory, its logico-linguistic significance has only recently begun to be explored. The Cayley table for \mathbb{Z}_2 \times \mathbb{Z}_2 looks as follows:

\begin{array}{ c|c c c c } \circ & (0, 0) & (1, 0) & (0, 1) & (1, 1) \\ \hline (0, 0) & (0, 0) & (1, 0) & (0, 1) & (1, 1) \\ (1, 0) & (1, 0) & (0, 0) & (1, 1) & (0, 1) \\ (0, 1) & (0, 1) & (1, 1) & (0, 0) & (1, 0) \\ (1, 1) & (1, 1) & (0, 1) & (1, 0) & (0, 0) \end{array}

 
Comparing the Cayley tables for \mathbb{Z}_2 \times \mathbb{Z}_2 and the Klein four group V_4, we see that the concrete isomorphism looks as follows:

(45)   \begin{equation*} \small{ID} \leftrightarrow (0, 0),\;\> \small{ENEG} \leftrightarrow (1, 0),\;\> \small{INEG} \leftrightarrow (0, 1),\;\> \small{DUAL} \leftrightarrow (1, 1). \end{equation*}

This group-theoretical isomorphism turns out to be very informative: 0 and 1 represent the number of times negation is being applied in a given Boolean algebra, and the left and right coordinates stand for the target and source Boolean algebra (i.e. external and internal negation), respectively. For example, \Tiny{ENEG} corresponds to (1, 0), which represents 1 external negation and 0 internal negations. Similarly, \Tiny{INEG} corresponds to (0, 1), which represents 0 external negations and 1 internal negation (keeping in mind that internal negation applies to all arguments). Using the conventions that \neg \ _\mathbb{A}^0 a := a and \neg \ _\mathbb{A}^1 a := \neg \ _\mathbb{A}a for all a \in \mathbb{A}, we thus find for any operator \small{O}: \mathbb{A}^n \rightarrow \mathbb{B} and i, \ k \in \{0, 1\}:

(46)   \begin{equation*} (i, \ k)(\small{O})(a_1, ...,a_n) = \neg \ ^i_\mathbb{B}\small{O}(\neg \ ^k_\mathbb{A} a_1, ... , \neg ^k_\mathbb{A} a_n). \end{equation*}

Representing V_4 as \mathbb{Z}_2 \times \mathbb{Z}_2 thus gives us a firm syntactic handle on duality: it shows how duality behavior arises out of the interplay of the independent behaviors (0 or 1) of an external and an internal negation (resp. left and right coordinate).
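Formula (46) translates almost literally into code. The following sketch assumes two-valued Boolean operators and checks that composing two such actions amounts to adding their coordinates modulo 2, which is the group operation of \mathbb{Z}_2 \times \mathbb{Z}_2. The names `neg_pow`, `act` and `composition_matches_addition` are illustrative.

```python
from itertools import product


def neg_pow(k, value):
    """Apply negation k times to a truth value, for k equal to 0 or 1."""
    return (not value) if k == 1 else value


def act(i, k, op):
    """The operator (i, k)(O) of (46): i external and k internal negations."""
    return lambda *args: neg_pow(i, op(*(neg_pow(k, a) for a in args)))


def table(op, n):
    return tuple(op(*args) for args in product([False, True], repeat=n))


def conj(p, q):
    return p and q


def disj(p, q):
    return p or q


def composition_matches_addition(op, n):
    """Check (i1, k1) after (i2, k2) against (i1 + i2, k1 + k2) modulo 2."""
    pairs = list(product((0, 1), repeat=2))
    for (i1, k1), (i2, k2) in product(pairs, repeat=2):
        composed = act(i1, k1, act(i2, k2, op))
        direct = act((i1 + i2) % 2, (k1 + k2) % 2, op)
        if table(composed, n) != table(direct, n):
            return False
    return True


if __name__ == "__main__":
    print(composition_matches_addition(conj, 2))
    print(table(act(1, 1, conj), 2) == table(disj, 2))
```

In this encoding, the coordinate pair (1, 1) applied to conjunction produces the truth table of disjunction, in line with the correspondence between \Tiny{DUAL} and (1, 1) in (45).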

Composed operators. The group-theoretical account of duality can be extended in a number of different ways. For example, Demey (2012a) has used it to study the duality behavior of composed operators. Given operators \small{O}_1: \mathbb{A}^n \rightarrow \mathbb{B} and \small{O}_2: \mathbb{B} \rightarrow \mathbb{C}, we will write \small{O}_2 \circ \small{O}_1: \mathbb{A}^n \rightarrow \mathbb{C} for the composed operator that first applies \small{O}_1 to the arguments, and then \small{O}_2. For simplicity, we will assume that \small{O}_2 is unary, but this assumption is not essential. In this article, we will focus on the basic example \forall \circ \square from modal syllogistics (Buridan 2001; Read 2012). A more linguistically motivated example, viz. possessives with multiple quantifiers, such as three athletes of each country, is discussed in Westerståhl (2012).

Each of \small{O}_1 and \small{O}_2 has its own internal and external negation, but it is easy to see that in the composed operator \small{O}_2 \circ \small{O}_1, the external negation of \small{O}_1 coincides with the internal negation of \small{O}_2. As a consequence, the composed operator \small{O}_2 \circ \small{O}_1 has three negations, namely external, intermediate, and internal (formally: \neg _\mathbb{C}, \neg _\mathbb{B}, and \neg _\mathbb{A}, respectively). Since each of these 3 negations may or may not be applied, \small{O}_2 \circ \small{O}_1 gives rise to 2^3 = 8 operators. As an example, consider the case of \forall \circ \square in (47):

(47)   \begin{equation*} \begin{array}{c|c|c|c|c} \phantom{\neg} \small{O}_2 \phantom{\neg} \small{O}_1 \phantom{\neg} &\phantom{\neg} \forall x \phantom{\neg} \square \phantom{\neg} \small{P}(x) & & \neg \small{O}_2 \neg \small{O}_1 \neg & \neg \forall x \neg \square \neg \small{P}(x) \\ \phantom{\neg} \small{O}_2 \phantom{\neg} \small{O}_1 \neg &\phantom{\neg} \forall x \phantom{\neg} \square \neg \small{P}(x) & & \neg \small{O}_2 \neg \small{O}_1 \phantom{\neg} & \neg \forall x \neg \square \phantom{\neg} \small{P}(x) \\ \phantom{\neg} \small{O}_2 \neg \small{O}_1 \phantom{\neg} &\phantom{\neg} \forall x \neg \square \phantom{\neg} \small{P}(x) & & \neg \small{O}_2 \phantom{\neg} \small{O}_1 \neg & \neg \forall x \phantom{\neg} \square \neg \small{P}(x) \\ \neg \small{O}_2 \phantom{\neg} \small{O}_1 \phantom{\neg} &\neg \forall x \phantom{\neg} \square \phantom{\neg} \small{P}(x) & & \phantom{\neg} \small{O}_2 \neg \small{O}_1 \neg & \phantom{\neg} \forall x \neg \square \neg \small{P}(x) \\ \end{array} \end{equation*}

In comparison to single operators, we see that composed operators have one additional negation, and hence, it should not be surprising that their duality behavior is not governed by \mathbb{Z}_2 \times \mathbb{Z}_2, but rather by \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2. Next to \Tiny{INEG} and \Tiny{ENEG}, there is also the intermediate negation function \Tiny{MNEG}, and the isomorphism given in (45) is generalized to the one defined by (48):

(48)   \begin{equation*} \small{ID} \leftrightarrow (0, 0, 0),\;\> \small{ENEG} \leftrightarrow (1, 0, 0),\;\> \small{MNEG} \leftrightarrow (0, 1, 0),\;\> \small{INEG} \leftrightarrow (0, 0, 1). \end{equation*}

In analogy to (46), it is now again possible to succinctly describe the effects of these operations:

(49)   \begin{equation*} (i,j,k)(\small{O}_2 \circ \small{O}_1)(a_1, ...,a_n) = \neg \ ^i_\mathbb{C}\small{O}_2\neg^j_\mathbb{B}\small{O}_1(\neg \ ^k_\mathbb{A} a_1, ... , \neg ^k_\mathbb{A} a_n). \end{equation*}

We also see that composed operators give rise to a much richer duality behavior than single operators. Recall that in the case of single operators, duality can be seen as the combination of the external and internal negations (\Tiny{DUAL} = \Tiny{ENEG} \circ \Tiny{INEG}). In the case of composed operators, however, we have three negations, and thus three pairwise combinations: \Tiny{ENEG} \circ \Tiny{INEG}, \Tiny{ENEG} \circ \Tiny{MNEG}, and \Tiny{MNEG} \circ \Tiny{INEG}. Although the first of these seems to be closest to what is classically called ‘duality’, the other two can plausibly be seen as (non-standard) duality operations too. Finally, there is also the operation \Tiny{ENEG} \circ \Tiny{MNEG} \circ \Tiny{INEG}, which operates on all negations simultaneously.
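The eight operators generated by \forall \circ \square can be computed on a small finite model. The sketch below rests on several simplifying assumptions that are not part of the text: two worlds with universal accessibility (as in \mathsf{S5}), a constant domain of two individuals, and predicates represented as sets of (world, individual) pairs. All names are hypothetical.

```python
from itertools import chain, combinations, product

WORLDS = (0, 1)
DOMAIN = ("a", "b")
PAIRS = frozenset(product(WORLDS, DOMAIN))


def box(pred):
    """Pairs (w, x) such that x satisfies pred in every world."""
    return frozenset((w, x) for (w, x) in PAIRS
                     if all((v, x) in pred for v in WORLDS))


def forall(pred):
    """Worlds in which every individual satisfies pred."""
    return frozenset(w for w in WORLDS
                     if all((w, x) in pred for x in DOMAIN))


def neg(k, value, top):
    """Complement relative to top if k is 1, otherwise leave value unchanged."""
    return top - value if k == 1 else value


def variant(i, j, k):
    """The operator (i, j, k) applied to forall-after-box, as in (49)."""
    def op(pred):
        inner = neg(k, pred, PAIRS)
        middle = neg(j, box(inner), PAIRS)
        return neg(i, forall(middle), frozenset(WORLDS))
    return op


def exists_diamond(pred):
    """Worlds in which some individual possibly satisfies pred."""
    return frozenset(w for w in WORLDS
                     if any(any((v, x) in pred for v in WORLDS) for x in DOMAIN))


ALL_PREDICATES = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(PAIRS), r) for r in range(len(PAIRS) + 1))]


def signature(op):
    """Behavior of an operator on every predicate of the model."""
    return tuple(op(p) for p in ALL_PREDICATES)


if __name__ == "__main__":
    sigs = {ijk: signature(variant(*ijk)) for ijk in product((0, 1), repeat=3)}
    print(len(set(sigs.values())), "distinct operators")
    print(sigs[(1, 0, 1)] == signature(exists_diamond))
```

On this model the triple (1, 0, 1), which combines external and internal negation, coincides with \exists \circ \Diamond, matching the remark that \Tiny{ENEG} \circ \Tiny{INEG} is closest to classical duality.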

Visualizing these duality patterns cannot be done by means of a square, but rather requires a duality cube. For example, Figure 6 shows a duality cube for the composed operator \forall \circ \square ; analogously, Westerståhl (2012) draws a duality cube for possessives with multiple quantifiers. Demey (2012a) makes use of the group-theoretical perspective to study the internal structure of this cube. It is a well known group-theoretical fact that the group \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2 has exactly 7 subgroups that are isomorphic to V_4. These can naturally be partitioned into three families, based on their number of ‘basic’ operations (i.e. operations governing a single negation: \Tiny{ENEG}, \Tiny{MNEG} and \Tiny{INEG}): (a) the first family consists of three groups that contain two basic operations, (b) the second family consists of three groups that contain one basic operation, and (c) the third family consists of a single group that does not contain any basic operations. Examples of groups from each of these families are given in (50a–c), respectively.

(50)   \begin{equation*} \begin{array}{l} (a) \; \; \{ \small{ID}, \small{ENEG}, \small{INEG}, \small{ENEG} \circ \small{INEG} \} \\ (b) \; \; \{ \small{ID}, \small{ENEG}, \small{MNEG} \circ \small{INEG}, \small{ENEG} \circ \small{MNEG} \circ \small{INEG}\} \\ (c) \; \; \{ \small{ID}, \small{ENEG} \circ \small{INEG}, \small{MNEG} \circ \small{INEG}, \small{ENEG} \circ \small{MNEG}\} \end{array} \end{equation*}

Each of these groups defines two complementary ‘duality squares’, and we thus find a total number of 7 \times 2 = 14 ‘duality squares’ inside the duality cube. (We are using the term ‘duality square’ inside scare quotes here, because some of these squares visualize non-standard duality operations that involve \Tiny{MNEG}; see above.) Note that, in contrast to the groups of families (a) and (b), the non-\Tiny{ID} elements of the group in family (c) pairwise share a basic operation. Demey (2012a) argues that this difference in group-theoretical structure correlates with a difference in geometric embedding of the squares inside the cube.
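The subgroup count and the three families can be verified by direct enumeration. The sketch below encodes the elements of \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2 as triples, following the isomorphism in (48), and treats the cosets of each subgroup as its two complementary ‘duality squares’. The function names are illustrative.

```python
from collections import Counter
from itertools import combinations, product

ELEMENTS = list(product((0, 1), repeat=3))
IDENTITY = (0, 0, 0)
# basic operations, encoded as in (48)
BASIC = {(1, 0, 0): "ENEG", (0, 1, 0): "MNEG", (0, 0, 1): "INEG"}


def add(u, v):
    """Group operation: coordinatewise addition modulo 2."""
    return tuple((a + b) % 2 for a, b in zip(u, v))


def v4_subgroups():
    """All four-element subgroups, each generated by two non-identity elements."""
    non_identity = [e for e in ELEMENTS if e != IDENTITY]
    return {frozenset({IDENTITY, u, v, add(u, v)})
            for u, v in combinations(non_identity, 2)}


def basic_count(subgroup):
    """Number of basic operations contained in a subgroup."""
    return sum(1 for e in subgroup if e in BASIC)


def cosets(subgroup):
    """The cosets of a subgroup, i.e. its complementary squares in the cube."""
    return {frozenset(add(e, g) for g in subgroup) for e in ELEMENTS}


if __name__ == "__main__":
    groups = v4_subgroups()
    print(len(groups), "subgroups isomorphic to V_4")
    print(Counter(basic_count(g) for g in groups))
    print(sum(len(cosets(g)) for g in groups), "squares in total")
```

Since every non-identity element of this group has order 2, any two distinct non-identity elements generate a four-element subgroup, which is why enumerating pairs suffices.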

Generalized Post duality. The group-theoretical account described above conforms to the basic requirement that internal negation be applied to all arguments of a given operator; see the k-superscripts in (46) and (49). Although the most canonical examples of duality indeed obey this requirement (recall the example of conjunction/disjunction from Section 1), there are also operators whose duality behavior seems to violate this requirement. For example, it was shown in Section 2 that in the relational perspective on generalized quantifiers, internal negation is applied only to the second argument, so that the internal negation of \small{Q}(\small{A}, \small{B}) is \small{Q}(\small{A}, \neg \small{B}), rather than \small{Q}(\neg \small{A}, \neg \small{B}). Similarly, in syllogistics one can independently study the effects of predicate negation, as in \small{Q}(\small{A}, \neg \small{B}), and of subject negation, as in \small{Q}(\neg \small{A}, \small{B}) (Keynes 1884; Johnson 1921; Reichenbach 1952; Hacker 1975). Finally, in public announcement logic, the dual of [ \ !\varphi \ ] \ \psi is defined as \neg [ \ !\varphi \ ] \ \neg \psi, so the internal negation of the binary [ \ ! \ \cdot \ ] \ \cdot operator is applied only to its second argument (\psi) (Demey 2012b).

Figure 6: Duality cube for the composed operator \forall \circ \square

If we drop the requirement that internal negation be applied to all arguments, the behavior that arises is called generalized Post duality (Humberstone 2011, p. 410ff.; Urquhart 2008). Consider an \small{n}-ary operator \small{O}: \mathbb{A}^n \rightarrow \mathbb{B}. This operator has 1 external and \small{n} independent internal negations. Since each of these \small{n} + 1 negations may or may not be applied, \small{O} gives rise to 2^{n+1} operators. As an example, consider the binary operator of conjunction:

(51)   \begin{equation*} \begin{array}{c|c|c|c|c} \phantom{\neg} \small{O}( \phantom{\neg} , \phantom{\neg} )& \phantom{\neg} (\phantom{\neg} p \wedge \phantom{\neg} q )\ & & \neg \small{O}( \neg, \neg) & \neg ( \neg p \wedge \neg q) \\ \phantom{\neg} \small{O}( \phantom{\neg} , \neg )&\phantom{\neg} (\phantom{\neg} p \wedge \neg q )\ & & \neg \small{O}( \neg, \phantom{\neg}) & \neg ( \neg p \wedge \phantom{\neg} q) \\ \phantom{\neg} \small{O}( \neg , \phantom{\neg} )&\phantom{\neg} (\neg p \wedge \phantom{\neg} q )\ & & \neg \small{O}( \phantom{\neg}, \neg) & \neg ( \phantom{\neg} p \wedge \neg q) \\ \neg \small{O}( \phantom{\neg} , \phantom{\neg} )& \neg (\phantom{\neg} p \wedge \phantom{\neg} q )\ & & \phantom{\neg} \small{O}( \neg, \neg) & \phantom{\neg} ( \neg p \wedge \neg q) \\ \end{array} \end{equation*}

In comparison to the ordinary duality behavior of a binary operator, we thus have \small{n}+1 rather than 2 independent negations, and generalized Post duality behavior is governed by the group \mathbb{Z}_2^{n+1} rather than \mathbb{Z}_2^2 (Libert 2012). Next to \Tiny{ENEG}, the operation of \Tiny{INEG} is split into \Tiny{INEG}_1, . . . , \Tiny{INEG}_n, with \Tiny{INEG}_i operating on the operator’s \small{i}^{th} argument, for 1 \ \leq \ \small{i} \ \leq \ \small{n}. Furthermore, the isomorphism given in (45) can be generalized to the one defined by (52):

(52)   \begin{alignat*}{3} \small{ID} & \; \leftrightarrow \; (0, 0, 0, \dots, 0, 0) & & \small{ENEG} & \; \leftrightarrow \; (1, 0, 0, \dots, 0, 0) \\ \small{INEG}_1 & \; \leftrightarrow \; (0, 1, 0, \dots, 0, 0) & \; \cdots \; \; & \small{INEG}_n & \; \leftrightarrow \; (0, 0, 0, \dots, 0, 1). \end{alignat*}

In analogy to (46), the effects of these operations can be described succinctly by means of (53). Note that (46) can be seen as a special case of (53), by requiring that \small{k}_1 = \small{k}_2 = ... = \small{k}_n.

(53)   \begin{equation*} (i,k_1,...,k_n)(\small{O})(a_1, ...,a_n) = \neg \ ^i_\mathbb{B}\small{O}(\neg \ ^{k_1}_\mathbb{A} a_1, ... , \neg ^{k_n}_\mathbb{A} a_n). \end{equation*}

As was the case with the duality behavior of a composed operator, we see that the generalized duality behavior of an n-ary operator is much richer than its ‘ordinary’ duality behavior. Consider again the binary operator of conjunction. If both arguments can be negated independently, there are several combinations of external and internal negation (\Tiny{ENEG} \circ \Tiny{INEG}_1, \Tiny{ENEG} \circ \Tiny{INEG}_2 and \Tiny{ENEG} \circ \Tiny{INEG}_1 \circ \Tiny{INEG}_2), all of which can plausibly be called duality operations. (The last one of these involves negating all arguments, and thus coincides with ‘ordinary’ duality.) As a consequence, visualizing the generalized duality behavior of conjunction requires a duality cube, as in Figure 7. Note that the diagonal plane that spans the front left and back right vertical edges of this cube corresponds to the ‘ordinary’ duality square for conjunction (see Figures 2(c) and 3(a)).
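The eight operators in (51) and the action described by (53) can be reproduced for conjunction. This is a minimal sketch, assuming two-valued truth tables; `act` takes a tuple (i, k_1, ..., k_n) and is a hypothetical helper, not notation from the literature.

```python
from itertools import product


def neg_pow(k, value):
    """Apply negation k times to a truth value, for k equal to 0 or 1."""
    return (not value) if k == 1 else value


def act(signs, op):
    """The operator (i, k_1, ..., k_n)(O) of (53)."""
    i, ks = signs[0], signs[1:]
    return lambda *args: neg_pow(
        i, op(*(neg_pow(k, a) for k, a in zip(ks, args))))


def table(op, n):
    return tuple(op(*args) for args in product([False, True], repeat=n))


def conj(p, q):
    return p and q


def disj(p, q):
    return p or q


# all 2^(n+1) = 8 variants of binary conjunction, keyed by (i, k_1, k_2)
VARIANTS = {signs: table(act(signs, conj), 2)
            for signs in product((0, 1), repeat=3)}

# ordinary duality is the special case k_1 = k_2
ORDINARY = {signs: t for signs, t in VARIANTS.items() if signs[1] == signs[2]}

if __name__ == "__main__":
    print(len(set(VARIANTS.values())), "distinct generalized variants")
    print(len(set(ORDINARY.values())), "distinct ordinary variants")
```

The ordinary duality square of conjunction is recovered as the four variants whose internal coordinates agree, which corresponds to the diagonal plane of the cube mentioned above.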

Finally, it should be noted that the duality cubes in Figures 6 and 7 are highly similar, which is due, of course, to the fact that they are two distinct manifestations of the group \mathbb{Z}_2^3 (and can thus serve as two distinct concrete interpretations of the abstract cube in Moretti (2012, p. 88)). This illustrates the strong connection between the ‘ordinary’ duality behavior of composed operators on the one hand and the generalized duality behavior of single (binary) operators on the other. Both cases involve creating an additional negation: the former achieves this by ‘splitting’ the operator, while the latter achieves it by ‘splitting’ the argument positions.

Figure 7: ‘Generalized Post duality’ cube for the binary operator \wedge.

5. Duality Relations and Aristotelian Relations

The Aristotelian relations. Next to the duality relations, there is another widely known set of logical relations, namely the Aristotelian relations, which were originally defined in the logical works of Aristotle (Ackrill 1961). These are defined relative to some background logical system \textsf{S}, which is assumed to have connectives expressing Boolean negation (\neg), conjunction (\wedge) and implication (\rightarrow), and a model-theoretic semantics (\models). Formally, the Aristotelian relations are defined as follows: the formulas \varphi and \psi are said to be

(54)   \begin{equation*} \begin{array}{l l l l l} \textsf{S}\textrm{-}contradictory & \mathrm{iff} & \textsf{S} \> \models \neg (\varphi \wedge \psi ) & \mathrm{and} & \textsf{S} \> \models \neg (\neg \varphi \wedge \neg \psi ), \\ \textsf{S}\textrm{-}contrary & \mathrm{iff} & \textsf{S} \> \models \neg (\varphi \wedge \psi ) & \mathrm{and} & \textsf{S} \> \not \models \neg (\neg \varphi \wedge \neg \psi ), \\ \textsf{S}\textrm{-}subcontrary & \mathrm{iff} & \textsf{S} \> \not \models \neg (\varphi \wedge \psi ) & \mathrm{and} & \textsf{S} \> \models \neg (\neg \varphi \wedge \neg \psi ), \\ \mathrm{in} \ \textsf{S}\textrm{-}subalternation & \mathrm{iff} & \textsf{S} \> \models \varphi \rightarrow \psi & \mathrm{and} & \textsf{S} \> \not \models \psi \rightarrow \varphi . \end{array} \end{equation*}

When the system \textsf{S} is clear from the context, it is often left implicit (Smessaert and Demey 2014). Informally, two formulas are contradictory iff they cannot be true together and cannot be false together; they are contrary iff they cannot be true together but may be false together; they are subcontrary iff they cannot be false together but may be true together; they are in subalternation iff the first one entails the second one but not vice versa. Finally, it should be noted that this definition of the Aristotelian relations can be generalized to arbitrary Boolean algebras, just like the definition of the duality relations provided in Section 3 (Demey and Smessaert 2016). However, since this generalization is less relevant for our current concerns, it will not be discussed here.
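For classical propositional logic, the definitions in (54) can be decided by inspecting truth tables. The following sketch assumes that \textsf{S} is \mathsf{CPL} and that formulas are given as Boolean functions of the same propositional variables. The function returns the set of all relations that hold, since a pair of formulas may satisfy more than one clause of (54) when one of them is a contradiction or a tautology. The name `aristotelian_relations` is illustrative.

```python
from itertools import product


def aristotelian_relations(phi, psi, n):
    """Set of Aristotelian relations from phi to psi over n variables."""
    rows = list(product([False, True], repeat=n))
    values = [(phi(*r), psi(*r)) for r in rows]

    never_both_true = not any(a and b for a, b in values)
    never_both_false = not any((not a) and (not b) for a, b in values)
    forward = all(b for a, b in values if a)    # phi entails psi
    backward = all(a for a, b in values if b)   # psi entails phi

    found = set()
    if never_both_true and never_both_false:
        found.add("contradictory")
    if never_both_true and not never_both_false:
        found.add("contrary")
    if never_both_false and not never_both_true:
        found.add("subcontrary")
    if forward and not backward:
        found.add("subalternation")
    return found


def conj(p, q):
    return p and q


def disj(p, q):
    return p or q


def neither(p, q):
    return not (p or q)


def not_both(p, q):
    return not (p and q)


if __name__ == "__main__":
    print(aristotelian_relations(conj, disj, 2))
    print(aristotelian_relations(conj, neither, 2))
    print(aristotelian_relations(disj, not_both, 2))
    print(aristotelian_relations(conj, not_both, 2))
```

Unlike the duality relations, subalternation depends on the order of the two formulas: swapping the arguments in the first call yields the empty set.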

The Aristotelian relations holding between a given set of formulas are often visualized by means of Aristotelian diagrams (based on graphical conventions such as the one shown in Figure 8(d)). The most widely known of these diagrams is the so-called ‘square of oppositions’, which comprises 4 formulas and the 6 Aristotelian relations holding between them. For example, Figure 8 shows Aristotelian squares involving (a) the propositional connectives of conjunction and disjunction, (b) the universal and existential quantifiers, and (c) the modal operators of necessity and possibility.

Figure 8: Aristotelian squares: (a) conjunction-disjunction, (b) universal-existential, (c) necessity-possibility; (d) graphical representations of the Aristotelian relations.

Similarities. The Aristotelian squares in Figure 8(a–c) closely resemble the duality squares in Figure 3(a–c), respectively. In particular: (i) on the diagonals, the duality relation \Tiny{ENEG} corresponds to the Aristotelian relation of contradiction, (ii) on the vertical edges, the duality relation \Tiny{DUAL} corresponds to the Aristotelian relation of subalternation, and (iii) on the horizontal edges, the duality relation \Tiny{INEG} corresponds to the Aristotelian relations of contrariety and subcontrariety. These strong similarities might explain why authors such as D’Alfonso (2012), Meles (2012) and Schumann (2013) have come close to straightforwardly identifying the two types of squares, for example by using Aristotelian terminology to describe the duality square (or vice versa), or by viewing one as a generalization of the other.

Furthermore, both Aristotelian and duality diagrams have been used by linguists to explain certain lexicalization patterns in natural languages. For example, Horn (1989) and Jaspers (2005) make use of the Aristotelian relations to explain the so-called non-lexicalization of the O-corner, i.e. the observation that natural languages have primitive lexical items for the quantifiers all, some and none, but not for not all (the latter’s lexicalization as a single word, for example *nall, does not occur in natural language). The same asymmetry can be found in the lexicalization pattern of the propositional connectives: natural languages have primitive lexical items for and, or and nor, but not for not and (the latter’s lexicalization as a single word, for example *nand, does not occur in natural language). These linguistic phenomena are also explained by Löbner (1990, 2011), but his phase quantification account is based on the duality relations, rather than the Aristotelian relations. Finally, it should be noted that the Aristotelian account of these lexical asymmetries has recently been generalized beyond the square by Seuren and Jaspers (2014).

Dissimilarities. As noted by Löbner (2011), Chow (2012) and Westerståhl (2012), there are also several differences between the duality square and the Aristotelian square. For example, although duality seems to correspond to subalternation, the former relation is symmetric, while the latter is asymmetric. Furthermore, although both sets of relations contain four members, there is no clean one-to-one mapping in either direction: on the one hand, the Aristotelian relations of contrariety and subcontrariety correspond to a single duality relation (\Tiny{INEG}), and on the other hand, the duality relation \Tiny{ID} does not correspond to any Aristotelian relation whatsoever. (However, Smessaert and Demey (2014) introduce a quasi-Aristotelian relation that holds precisely between a formula and itself, and thus does correspond to the duality relation \Tiny{ID}.)

Another difference concerns sensitivity to the specific axioms of the background logic (Demey 2015). Consider, for example, the modal operators \square, \lozenge: \mathbb{B}_{\textsf{S}} \rightarrow \mathbb{B}_{\textsf{S}}, where \mathbb{B}_{\textsf{S}} is the Lindenbaum-Tarski algebra of some normal modal logic \textsf{S}. The Aristotelian relation holding between these operators depends on the logical system \textsf{S}: in normal modal systems that are at least as strong as \textsf{KD}, there is a subalternation from \square p to \lozenge p, but in weaker normal modal systems, there is no Aristotelian relation at all between these two formulas (Hughes and Cresswell 1996). Nevertheless, in all of these modal systems, it is the case that \square \varphi is logically equivalent to \neg \lozenge \neg \varphi for all formulas \varphi \in \mathcal{L}_{\textsf{S}}, and hence \square [\varphi] = \neg \lozenge \neg [\varphi] for all [\varphi] \in \mathbb{B}_{\textsf{S}}. This means exactly that \Tiny{DUAL} (\square, \lozenge), and hence the duality relation holding between \square and \lozenge holds independently of the specific axioms of the logical system \textsf{S}.
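The contrast between logic-sensitive subalternation and logic-independent duality can be illustrated semantically on small Kripke frames. The sketch below is an assumption-laden illustration rather than a treatment of the Lindenbaum-Tarski algebra itself: it uses two hypothetical two-world frames, one serial (so it validates the \textsf{KD} axiom) and one with a dead-end world (a frame for the weaker system \textsf{K} only), and checks both properties for every possible extension of a formula. A single dead-end frame suffices as a counter-model to subalternation in the weaker system, while the serial frame merely illustrates the \textsf{KD} case.

```python
from itertools import chain, combinations


def box(R, ext, worlds):
    """Worlds where 'box phi' holds, given the set ext of worlds where phi holds."""
    return {w for w in worlds if all(v in ext for v in R.get(w, ()))}


def diamond(R, ext, worlds):
    """Worlds where 'diamond phi' holds, given the set ext of worlds where phi holds."""
    return {w for w in worlds if any(v in ext for v in R.get(w, ()))}


def subsets(worlds):
    """All candidate extensions of a formula over the given worlds."""
    ws = sorted(worlds)
    return [set(c) for c in
            chain.from_iterable(combinations(ws, k) for k in range(len(ws) + 1))]


def duality_holds(R, worlds):
    """box phi has the same extension as not-diamond-not phi, for every extension."""
    return all(box(R, e, worlds) == worlds - diamond(R, worlds - e, worlds)
               for e in subsets(worlds))


def box_implies_diamond(R, worlds):
    """box phi implies diamond phi at every world, for every extension."""
    return all(box(R, e, worlds) <= diamond(R, e, worlds)
               for e in subsets(worlds))


WORLDS = {0, 1}
R_SERIAL = {0: {1}, 1: {1}}     # every world has a successor
R_DEADEND = {0: {1}, 1: set()}  # world 1 has no successor

print(duality_holds(R_SERIAL, WORLDS), duality_holds(R_DEADEND, WORLDS))
print(box_implies_diamond(R_SERIAL, WORLDS), box_implies_diamond(R_DEADEND, WORLDS))
```

On the dead-end frame, every boxed formula is vacuously true at world 1 while no diamond formula is, so the implication from \square p to \lozenge p fails there even though the duality equation continues to hold.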

At this point, it might be objected that the duality relations are logic-sensitive after all; for example, conjunction and disjunction are dual to one another in classical propositional logic (\textsf{CPL}), but not in intuitionistic propositional logic (\textsf{IPL}). However, the Lindenbaum-Tarski algebra of \textsf{IPL} is itself not a Boolean algebra (but rather a Heyting algebra), and thus falls outside the scope of the definition of the duality relations that was provided in Section 3.

Another difference between the duality and the Aristotelian relations is that the former, but not the latter, are functional. As was already discussed in Section 3, every formula has exactly one internal negation, exactly one external negation, and exactly one dual (up to logical equivalence). By contrast, the Aristotelian relations are not functional: for example, a given formula might be contrary to several (non-equivalent) formulas. As illustrated by Smessaert (2012), this difference becomes much more apparent if we move from squares to larger diagrams. For example, Figures 9(a–b) show an Aristotelian and a duality diagram for the same set of six modal formulas. Consider the formula \square p. Within the Aristotelian hexagon, this formula has two (non-equivalent) contraries, namely \square \neg p and \lozenge p \wedge \lozenge \neg p. From a duality perspective, the first of these two formulas is the internal negation of \square p, but the second one stands in no duality relation at all to \square p. The duality ‘hexagon’ in Figure 9(b) thus ultimately turns out to consist of two independent components: the ordinary duality square in Figure 9(c) and the degenerate duality pattern (containing two formulas that are their own internal negations) in Figure 9(d).
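The functional/non-functional contrast can also be seen in the simpler setting of binary propositional connectives. The sketch below assumes that the dual of a binary operator f is the operator mapping (p, q) to ¬f(¬p, ¬q), which is how the Section 3 definition is usually read for connectives; the helper names (`table`, `contraries`, `dual`) are illustrative. It enumerates all 16 binary truth functions, identified up to logical equivalence by their truth tables.

```python
from itertools import product

ROWS = list(product([False, True], repeat=2))                # (p, q) valuations
ALL_TABLES = list(product([False, True], repeat=len(ROWS)))  # 16 truth tables


def table(f):
    """Truth table of a binary connective, as a tuple over ROWS."""
    return tuple(f(p, q) for p, q in ROWS)


def contraries(t):
    """All truth tables that cannot be true together with t
    but can be false together with t."""
    return [u for u in ALL_TABLES
            if not any(a and b for a, b in zip(t, u))
            and any(not a and not b for a, b in zip(t, u))]


def dual(f):
    """Truth table of the dual of f (assumed definition: not f(not p, not q))."""
    return table(lambda p, q: not f(not p, not q))


def conj(p, q): return p and q
def disj(p, q): return p or q


print(dual(conj) == table(disj))      # the unique dual of conjunction
print(len(contraries(table(conj))))   # number of contraries of conjunction
```

Under these assumptions, conjunction has exactly one dual (disjunction) but seven pairwise non-equivalent contraries, one of which is the non-contingent constant-false table.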

Figure 9: (a) Aristotelian hexagon (for a modal system that is at least as strong as \textsf{KD}), (b) duality ‘hexagon’, and (c–d) its two components.

Finally, it should also be noted that it is perfectly possible for two operators/formulas to stand in a duality relation without standing in any Aristotelian relation, or vice versa. Moving to the level of diagrams, this means that it is possible for four operators/formulas to constitute a duality square without constituting an Aristotelian square, or vice versa (Löbner 1986). For example, the aspectual adverbs already, still, not yet and no longer constitute a duality square (see Figure 4(b)), but not an Aristotelian square: for example, already and still are each other’s duals, but there is no subalternation between them in either direction. Analogously, the modal formulas \square p, \square p \vee \square \neg p, \lozenge \neg p and \lozenge p \wedge \lozenge \neg p constitute an Aristotelian square (embedded inside the Aristotelian hexagon in Figure 9(a) with a counterclockwise rotation of 120°), but not a duality square: for example, \square p and \lozenge p \wedge \lozenge \neg p are contraries, but there is no duality relation between them. In fact, looking at these four modal formulas in the duality ‘hexagon’ in Figure 9(b), we see that \square p \vee \square \neg p and \lozenge p \wedge \lozenge \neg p by themselves constitute a degenerate duality pattern (Figure 9(d)), while \square p and \lozenge \neg p belong to another, ‘real’ duality square (Figure 9(c)).

6. References and Further Reading

  • Ackrill, J. (1961). Aristotle’s Categories and De Interpretatione. Clarendon Press, Oxford.
  • Barwise, J. and Cooper, R. (1981). Generalized quantifiers and natural language. Linguistics and Philosophy, 4:159–219.
  • Blackburn, P., de Rijke, M., and Venema, Y. (2001). Modal Logic. Cambridge University Press, Cambridge.
  • Brisson, C. (2003). Plurals, All, and the nonuniformity of collective predication. Linguistics and Philosophy, 26:129–184.
  • Buridan, J. (2001). Summulae de Dialectica. Translated by Gyula Klima. Yale University Press, New Haven, CT.
  • Chow, K. (2012). General patterns of opposition squares and 2n-gons. In Beziau, J.-Y. and Jacquette, D., editors, Around and Beyond the Square of Opposition, pages 263–275. Springer, Basel.
  • D’Alfonso, D. (2012). The square of opposition and generalized quantifiers. In Beziau, J.-Y. and Payette, G., editors, Around and Beyond the Square of Opposition, pages 219–227. Springer, Basel.
  • Davey, B. A. and Priestley, H. A. (2002). Introduction to Lattices and Order (Second Edition). Cambridge University Press, Cambridge.
  • Demey, L. (2012a). Algebraic aspects of duality diagrams. In Philip T. Cox, B. P. and Rodgers, P., editors, Diagrammatic Representation and Inference, Lecture Notes in Computer Science (LNCS) 7352, pages 300–302. Springer, Berlin.
  • Demey, L. (2012b). Structures of oppositions for public announcement logic. In Beziau, J.-Y. and Jacquette, D., editors, Around and Beyond the Square of Opposition, pages 313–339. Springer, Basel.
  • Demey, L. (2015). Interactively illustrating the context-sensitivity of Aristotelian diagrams. In Christiansen, H., Stojanovic, I., and Papadopoulos, G., editors, Modeling and Using Context, LNCS 9405, pages 331–345. Springer.
  • Demey, L. and Smessaert, H. (2016). Metalogical decorations of logical diagrams. Logica Universalis, 10:233–292.
  • Dowty, D. (1987). Collective predicates, distributive predicates, and All. In Marshall, F., editor, Proceedings of the 3rd Eastern States Conference on Linguistics (ESCOL), pages 97–115. Ohio State University, Columbus, OH.
  • Freudenthal, H. (1960). Lincos. Design of a Language for Cosmic Intercourse. North-Holland, Amsterdam.
  • Gamut, L. (1991). Logic, Language, and Meaning.
  • Givant, S. and Halmos, P. (2009). Introduction to Boolean Algebras. Springer, New York, NY.
  • Gottschalk, W. H. (1953). The theory of quaternality. Journal of Symbolic Logic, 18:193–196.
  • Gowers, T., editor (2008). The Princeton Companion to Mathematics. Princeton University Press, Princeton, NJ.
  • Hacker, E. A. (1975). The octagon of opposition. Notre Dame Journal of Formal Logic, 16:352–353.
  • Henkin, L., Monk, J. D., and Tarski, A. (1971). Cylindric Algebras, Part I. North-Holland, Amsterdam.
  • Horn, L. (2006). The border wars: A neo-Gricean perspective. In von Heusinger, K. and Turner, K., editors, Where Semantics Meets Pragmatics, pages 21–48. Elsevier, Amsterdam.
  • Horn, L. R. (1989). A Natural History of Negation. University of Chicago Press, Chicago, IL.
  • Horn, L. R. (2004). Implicature. In Horn, L. R. and Ward, G., editors, Handbook of Pragmatics, pages 3–28. Blackwell, Oxford.
  • Hughes, G. E. and Cresswell, M. J. (1996). A New Introduction to Modal Logic. Routledge, London.
  • Humberstone, L. (2011). The Connectives. MIT Press, Cambridge, MA.
  • Iten, C. (1998). Because and although: a case of duality? In Rouchota, V. and Jucker, A. H., editors, Current Issues in Relevance Theory, pages 59–80. John Benjamins, Amsterdam.
  • Iten, C. (2005). Linguistic Meaning, Truth Conditions and Relevance: The Case of Concessives. Palgrave Macmillan, Basingstoke/New York (NY).
  • Jaspers, D. (2005). Operators in the Lexicon. On the Negative Logic of Natural Language. LOT Publications, Utrecht.
  • Johnson, W. (1921). Logic. Part I. Cambridge University Press, Cambridge.
  • Kabakov, F. A., Parkhomenko, A. S., Voitsekhovskii, M. I., and Fofanova, T. S. (2014). Duality principle. In Encyclopedia of Mathematics. Springer, available at http://www.encyclopediaofmath.org/index.php?title=Duality principle&oldid=35095.
  • Keynes, J. N. (1884). Studies and Exercises in Formal Logic. MacMillan, London.
  • König, E. (1991). Concessive relations as the dual of causal relations. In Zaefferer, D., editor, Semantic Universals and Universal Semantics, volume 12 of Groningen-Amsterdam Studies in Semantics, pages 190–209. Foris, Berlin.
  • Kripke, S. (1977). Speaker’s reference and semantic reference. In French, P., Uehling, Jr., T., and Wettstein, H., editors, Contemporary perspectives in the philosophy of language, pages 6–27. University of Minnesota Press, Minneapolis, MN.
  • Libert, T. (2012). Hypercubes of duality. In Beziau, J.-Y. and Jacquette, D., editors, Around and Beyond the Square of Opposition, pages 293–301. Springer, Basel.
  • Löbner, S. (1986). Quantification as a major module. In Groenendijk, J., de Jongh, D., and Stokhof, M., editors, Studies in Discourse Representation Theory and the Theory of Generalized Quantifiers, pages 53–85. Foris, Dordrecht.
  • Löbner, S. (1987). Natural language and generalized quantifier theory. In Gärdenfors, P., editor, Generalized Quantifiers, pages 181–201. Reidel, Dordrecht.
  • Löbner, S. (1989). German schon – erst – noch: an integrated analysis. Linguistics and Philosophy, 12:167–212.
  • Löbner, S. (1990). Wahr neben Falsch. Duale Operatoren als die Quantoren natürlicher Sprache. Max Niemeyer Verlag, Tübingen.
  • Löbner, S. (1999). Why German schon and noch are still duals: a reply to van der Auwera. Linguistics and Philosophy, 22:45–107.
  • Löbner, S. (2011). Dual oppositions in lexical meaning. In Maienborn, C., von Heusinger, K., and Portner, P., editors, Semantics: An International Handbook of Natural Language Meaning, volume I, pages 479–506. de Gruyter Mouton, Berlin.
  • Mac Lane, S. (1998). Categories for the Working Mathematician. Springer, Berlin.
  • Meles, B. (2012). No group of opposition for constructive logics: The intuitionistic and linear cases. In Beziau, J.-Y. and Payette, G., editors, Around and Beyond the Square of Opposition, pages 201–217. Springer, Basel.
  • Michaelis, L. (1996). On the use and meaning of already. Linguistics and Philosophy, 19:477–502.
  • Mittwoch, A. (1993). The relationship between schon/already and noch/still: A reply to Löbner. Natural Language Semantics, 2:71–82.
  • Moretti, A. (2012). Why the logical hexagon? Logica Universalis, 6:69–107.
  • Peters, S. and Westerståhl, D. (2006). Quantifiers in Language and Logic. Oxford University Press, Oxford.
  • Peterson, P. (1979). On the logic of “few”, “many”, and “most”. Notre Dame Journal of Formal Logic, 20:155–179.
  • Piaget, J. (1949). Traité de logique. Essai de logistique opératoire. Colin/Dunod, Paris.
  • Read, S. (2012). John Buridan’s theory of consequence and his octagons of opposition. In Beziau, J.-Y. and Jacquette, D., editors, Around and Beyond the Square of Opposition, pages 93–110. Springer, Basel.
  • Reichenbach, H. (1952). The syllogism revised. Philosophy of Science, 19:1–16.
  • Schumann, A. (2013). On two squares of opposition: the Leśniewski’s style formalization of synthetic propositions. Acta Analytica, 28:71–93.
  • Seuren, P. and Jaspers, D. (2014). Logico-cognitive structure in the lexicon. Language, 90:607–643.
  • Smessaert, H. (2012). The classical Aristotelian hexagon versus the modern duality hexagon. Logica Universalis, 6:171–199.
  • Smessaert, H. and Demey, L. (2014). Logical geometries and information in the square of oppositions. Journal of Logic, Language and Information, 23:527–565.
  • Smessaert, H. and ter Meulen, A. (2004). Temporal reasoning with aspectual adverbs. Linguistics and Philosophy, 27:209–261.
  • Urquhart, A. (2008). Emil Post. In Gabbay, D. M. and Woods, J., editors, Handbook of the History of Logic. Volume 5. Logic from Russell to Church. Elsevier, Amsterdam.
  • van Benthem, J. (1991). Linguistic universals in logical semantics. In Zaefferer, D., editor, Semantic Universals and Universal Semantics, volume 12 of Groningen-Amsterdam Studies in Semantics, pages 17–36. Foris, Berlin.
  • van der Auwera, J. (1993). ‘Already’ and ‘still’: beyond duality. Linguistics and Philosophy, 16:613–653.
  • Westerståhl, D. (2012). Classical vs. modern squares of opposition, and beyond. In Beziau, J.-Y. and Payette, G., editors, The Square of Opposition. A General Framework for Cognition, pages 195–229. Peter Lang, Bern.

Author Information

Lorenz Demey
Email: lorenz.demey@kuleuven.be
Catholic University of Leuven
Belgium

and

Hans Smessaert
Email: hans.smessaert@kuleuven.be
Catholic University of Leuven
Belgium

The Meaning of Life: Contemporary Analytic Perspectives

Depending on whom one asks, the question, “What is the meaning of life?” is either the most profound question of human existence or else nothing more than a nonsensical request built on conceptual confusion, much like, “What does the color red taste like?” or “What is heavier than the heaviest object?” Ask a non-philosopher, “What do philosophers discuss?” and a likely answer will be, “The meaning of life.” Ask the same question of a philosopher within the analytic tradition, and you will rarely get this answer. The sources of suspicion about the question within analytic philosophy, especially in earlier periods, are varied. First, the question of life’s meaning is conceptually challenging because of terms like “the,” “meaning,” and “life,” and especially given the grammatical form in which they are arranged. Second, it is often asked with transcendent, spiritual, or religious assumptions at the fore about what the world “should” be like in order for there to be a meaning of life. In so far as the question is entangled with such ideas, the worry is that even if the concept of a meaning of life is coherent, there likely is not one.

Despite such suspicions and relative disinterest in the question of life’s meaning among analytic philosophers for a large part of the twentieth century, a growing body of work on the topic has appeared over roughly the last two decades. Much of this work focuses on developing and defending theories of meaning in life (see Section 2.d. for more on the distinction between meaning in life and the meaning of life) via conceptual analyses of the necessary and sufficient conditions for meaningful life. A smaller, though no less important, subset of work in this growing field focuses on why we even use “meaning” in the first place to voice our questions and concerns about central facets of the human condition.

This article surveys important trajectories in discussions of life’s meaning within contemporary analytic philosophy. It begins by introducing key aspects of the human context in which the question is asked. The article then investigates three ideas that illumine what meaning means in this context: sense-making, purpose, and significance. The article continues by surveying important topics that provide a greater understanding of what is involved in our requests for meaning. After briefly surveying theories of meaning in life, it concludes with discussions of death and futility, followed by important areas of research that remain under-investigated.

Table of Contents

  1. The Human Context
  2. The Contemporary Analytic Context: Prolegomena
    1. The Meanings of “Meaning”
      1. Sense-Making
      2. Purpose
      3. Significance
    2. The Word “Life”
    3. The Definite Article
    4. Meaning of Life vs. Meaning in Life
    5. What is the Meaning of x?
    6. Interpretive Strategies
      1. The Amalgam Approach
      2. The Single Question Approach
  3. Theories of Meaning in Life
    1. Supernaturalism
    2. Subjective Naturalism
    3. Objective Naturalism
    4. Hybrid Naturalism
    5. Pessimistic Naturalism: Nihilism
    6. Structural Contours of Meaning in Life
  4. Death, Futility, and a Meaningful Life
  5. Underinvestigated Areas
  6. References and Further Reading

1. The Human Context

The human desire for meaning finds vivid expression in the stories we tell, diaries we keep, and in our deepest hopes and fears. According to twentieth-century Freudian psychoanalyst Bruno Bettelheim, “our greatest need and most difficult achievement is to find meaning in our lives” (Bettelheim 1978: 3). Holocaust survivor and psychiatrist Viktor Frankl said that the human will to meaning comes prior to either our will to pleasure or will to power (Frankl 2006: 99).

Questions about meaning arise and take shape within varied contexts: when struggling to make an important decision about what to do with our lives, when trapped in a job we hate, when wondering if there is more to life than the daily hum-drum, when diagnosed with a terminal illness, when experiencing the loss of a loved one, when feeling small while looking up at the night sky, when wondering if this universe is all there is and why it is even here in the first place, when questioning whether life and love will have a lasting place in the universe or whether the whole show will end in utter and everlasting desolation and silence.

Lurking behind many of our questions about meaning is our capacity to get outside of ourselves, to view our lives from a wider standpoint, a standpoint from which to understand the setting for our lives and question the “why?” of what we do. Humans possess self-awareness, and can take an observational, self-reflective viewpoint on our lives. In this, we are able to shift from mere automatic engagement to observation and evaluation. We do more than simply respond to streams of stimuli. We step back and question who we are and what we do. Shifting our focus to the widest standpoint—sub specie aeternitatis (literally, from the perspective of eternity; a universal perspective)—we wonder how such infinitesimally small and fleeting creatures like ourselves fit in the grand scheme of things, within vast space and time. We worry about whether a reality of such staggering magnitude, at the deepest level, cares about us (for related discussions, see Fischer 1993; Kahane 2013; Landau 2011; Nagel 1971, 1989; and Seachris 2013).

That our concerns about meaning are often cosmically-focused is instructive. Despite the current theoretical emphasis in analytic philosophy on the more terrestrially-focused idea of meaning in life, questions about meaning are very often cosmic in scope. In the words of sociologist Peter Berger, in seeking life’s meaning, many are attempting to locate it “within a sacred and cosmic frame of reference” and trying to plumb the connection “between microcosm and macrocosm” (Berger 1967: 27). This is an important reason why God, transcendence, and other ideas embodied and expressed in religion are so often thought to be relevant to life’s meaning.

2. The Contemporary Analytic Context: Prolegomena

Relatively speaking, not too long ago many analytic philosophers were suspicious that the question of life’s meaning was incoherent. Such views found expression in popular culture too, for example, in Douglas Adams’ widely read book The Hitchhiker’s Guide to the Galaxy. The story’s central characters visit the legendary planet Magrathea and learn about a race of hyper-intelligent beings who built a computer named Deep Thought. Deep Thought’s purpose was to answer the ultimate question of life, the universe, and everything, that answer being a bewildering 42. Deep Thought explained that this answer was incomprehensible because the beings who designed it, though super-intelligent, did not really know what they were asking in the first place. Asking for life’s meaning might be like this, in which case 42 is as good an answer as any other.

Some analytic philosophers in the twentieth century, in the wake of logical positivism, shared Deep Thought’s suspicion. They were particularly wary of the traditional formulation: What is the meaning of life? Meaning, it was thought, belongs in the linguistic realm. Words, sentences, and other linguistic constructions are the proper bearers of meaning, not objects, events, or states of affairs, and certainly not life itself. Some philosophers thought that in asking for life’s meaning, we use an ill-chosen expression to voice something real, perhaps an emotional response of awe or wonder at the staggering fact that anything exists at all. Yet, experiencing such feelings and asking a meaningful question are two different things altogether.

Asking what something means, though, need not be a strictly semantic activity. We ask for the meanings of all kinds of things and employ “meaning” in a wide variety of contexts in everyday life, only some of which are narrowly linguistic. Paying careful attention to the meanings of “meaning” provides important clues about what life’s meaning is all about. Three connotations in particular are instructive here: sense-making, purpose, and significance.

a. The Meanings of “Meaning”

Meaning-talk is common in everyday discourse. Most ordinary uses of “meaning” tend to cluster around three basic ideas: (1) sense-making (which can include the ideas of intelligibility, clarification, or coherence), (2) purpose, and (3) significance (which can include the idea of value). The following list of statements and questions captures the richly varied ways in which we employ the concept of meaning on a regular basis.

Meaning as Sense-Making

  1. What you said didn’t mean a thing.
  2. What did you mean by that statement?
  3. Do you know what I mean?
  4. What did you mean by that face? (overlaps with purpose)
  5. What is the meaning of that book? (what is it about?)
  6. What is the meaning of this? (for example, when asked upon returning home to find one’s house ransacked)

Meaning as Purpose

  1. What did you mean by that face? (overlaps with intelligibility)
  2. The tantrum is meant to catch his dad’s attention.
  3. What is the meaning of that book? (why was it written?)
  4. I really mean it!
  5. I didn’t mean to do it. I promise!

Meaning as Significance

  1. That was such a meaningful
  2. This watch really means something to me.
  3. That is a highly meaningful event in the life of that city.
  4. What do his first six months in office mean for the country? (likely overlaps with intelligibility)
  5. That is a meaningful
  6. That is a meaningless
  7. You mean nothing to me.

i. Sense-Making

This category is an important ordinary sense of meaning and connotes ideas like intelligibility, clarification, and coherence. Something has meaning if it makes sense; it lacks meaning if it does not. One way of understanding sense-making is through the idea of proper fit. Words, concepts, propositions, but also events and states of affairs, make sense and are meaningful if and when they fit together properly; if they lack such fit, they make no sense and are meaningless. This applies narrowly. For example, it makes no sense to ask, “What is brighter than the brightest light source?” because asking what is brighter does not fit with the concept brightest. But the idea of fit has a broader application too. We say things like:

  1. It does not make sense for the president to send in troops given the geopolitical situation in the region.
  2. Asking philosophy students to perform long-division on their midterm makes no sense.

In each of these situations, we perceive a lack of fit—a lack of fit between a decision and circumstances surrounding that decision or between reasonable expectations about what one will find on a philosophy exam and what one actually finds. There is a kind of absurdity here. Perceiving this weaker lack of fit will be a product of beliefs, norms, and other epistemic, evaluative, and social commitments. Therefore, determining whether or not something, in fact, involves a lack of fit in this broader sense often will be a messier task than in cases of narrow sense-making.

Ascertaining meaning, then, is often about fitting something into a larger context or whole: words into sentences, paragraphs, novels, or monographs; musical notes into measures, movements, and symphonies (i.e., the movement from mere sound to music); parts of a photograph within the entire photograph. Meaning is about intelligibility within a wider frame, about “inserting small parts into a larger, integrated context” (Svendsen 2005: 29). Similarly, we can plausibly view our requests for the meaning of life as attempts to secure the overarching context through which to make sense of our lives in the universe (see Thomson 2003: 132-138). Our focus here is on existentially weighty matters that define and depict the human condition: questions and concerns surrounding origins, purpose, significance, value, suffering, and death and destiny. We want answers to our questions about these matters, and want these answers to fit together in an existentially satisfying way. We want life to make sense, and when it does not, we are haunted by the specter of meaninglessness.

ii. Purpose

Requests for meaning are very often requests for purpose. We want to know whether we have a purpose(s) and if so, what it is. Many assume that there is a cosmic purpose around which to order our lives. A cosmic purpose likely would require transcendence or God. Someone must intend it all in order for there to be a purpose of it all. One might reject the idea of cosmic purpose, though, and still frame the question about life’s meaning as one largely about purpose. In this case, meaningful life (or meaning in life) is about ordering one’s life around self-determined purposes.

We also distinguish actions done on purpose from those done by accident. We use meaning (or meant) to contrast willful from non-willful action. We say things like, “I really mean it” to indicate the ‘full’ operation of our will. Alternatively, our child might say, “I didn’t mean it, I promise!” to indicate that she did not intend to spill her glass of milk. This sense of “meant” is also relevant for life’s meaning. We want sufficient autonomy, and when it is absent or severely mitigated, we worry about the meaningfulness of our lives (see Mawson 2016; Sartre 1973). Most of us do not want to walk through life haphazardly, nor in a way that is largely determined apart from our own consent. Likely one aspect of meaningful life, then, is life lived with our wills sufficiently engaged, one lived on purpose. These two shades of purpose are probably related. We want to really mean it as we select and align our lives with aims that will provide the salient structural rhythms to our day-to-day existence. In other words, we do not want to be alienated from the purposes that guide our lives.

Purpose and sense-making often are connected. Purpose itself, via future-targeted goals that shape pre-goal activity, provides important aspects of the structure that serves as the framework through which life fits together and makes sense. Lives that fit together and make sense—meaningful lives—are those that are sufficiently teleological. Working to attain goals at various levels of life-centrality is likely a facet of life properly fitting together and therefore being meaningful. Teleological threads connecting discrete life episodes are then necessary for a robust kind of sense-making in life. Lives lacking this are threatened with a sort of unintelligibility that results from being insufficiently structured by a telos. In the words of philosopher Alasdair MacIntyre:

When someone complains…that his or her life is meaningless, he or she is often and perhaps characteristically complaining that the narrative of their life has become unintelligible to them, that it lacks any point, any movement toward a climax or a telos (MacIntyre 2007: 217).

iii. Significance

Meaning often conveys the idea of significance, and significance tracks a related cluster of notions like mattering, importance, impact, salience, being the object of care and concern, and value, depending on context. We contrast trivial discussions about the mundane with deep discussions about important matters, referring to the latter as meaningful or significant. Physical objects deeply enmeshed in our life stories are meaningful. We view actions and events that have salient implications as significant, and in cases where that significance has positive value, as meaningful (whether a person can lead a meaningful life in virtue of making large negative impacts is a growing topic of discussion as the field seeks to understand the connection between meaning and morality; see Campbell and Nyholm 2015). Finding the cure for that disease was meaningful because it had such a large positive impact within a certain frame of cares and concerns. This shade of meaning is also in view in cases where some piece or set of data crosses a threshold of salience against background information. That such a large percentage of the population living under certain conditions is getting a particular disease is statistically significant or statistically meaningful. In this way, sense-making and significance senses of meaning connect.

Alternatively, when something does not matter to us, we might say, “That means nothing to me.” It was just a meaningless conversation; it was inconsequential. That game did not matter because the playoffs were already set. The wrapping paper does not matter; what is on the inside of the package counts. That piece of information is not meaningful relative to the aims and questions guiding one’s inquiry. Spending your life sitting on the couch and watching sitcom re-runs on Netflix is meaningless; you do nothing that matters, you do nothing of importance or value, and so on.

Something’s significance is often and largely gauged in relation to a perspective, horizon, or point of reference, all of which can be dynamic. Something that is significant from one vantage point may, and often does, lose its significance when viewed from a broader horizon. Scraping your knee at age four is significant, at least from a four-year-old’s perspective. When one looks back decades later, its significance wanes. Most events important enough to make it into local lore will not matter enough to be included in a national history, let alone world and, especially, cosmic history. One quickly sees resources available from which to generate pessimistic meaning of life concerns vis-à-vis human significance as one broadens horizons, eventually terminating in the widest cosmic perspective.

Significance is often distinctly normative and person-al. When we say that something is meaningful in the sense of being significant, important, or mattering, we make a kind of evaluative claim about what is good or valuable. Additionally, significance is often connected with being the object of a person’s evaluations, cares, and concerns. Things are, most naturally, significant to someone.

Insofar as meaning is thought to have an affective dimension, that dimension likely intersects with significance. If my grandmother’s necklace is meaningful to me, it has value, it matters, and affective states fitting a certain psychological profile, like being deeply stirred or moved, often accompany such assessments of value and mattering. Though this may not make such affective states a further type of meaning or constitutive of meaning, these states reliably track instances of significance or perceived significance.

Like sense-making and purpose, significance is relevant to life’s meaning. In broad terms, one way of construing meaningful life is as a life that matters and has positive value. This, of course, admits of various understandings of mattering that, at one level, might track the objective naturalist, subjective naturalist, hybrid naturalist, and supernaturalist debate (see Section 3 below): matters to whom and according to what standard? Additionally, some find it difficult to separate personal and cosmic concerns over significance. Cosmic concerns, for many, are also intensely personal. If the universe as a whole lacks significance, some worry that their individual lives lack significance, or at least the kind that they think a deeply meaningful life requires.

b. The Word “Life”

Understanding what life’s meaning is all about is complicated, not just because of the expansive semantic range of “meaning,” but also because it is not immediately clear how we should understand the word “life” in the question. In asking for life’s meaning, we are not, at least most of us, asking for the meaning of the word “life.” Neither are we asking about how being alive is different from being non-living or how being organic is different from being inorganic. What then are we asking, and what is the scope of that request? Our question(s) about life’s meaning likely range over the following options:

Life1 = individual human life (meaning of my life)

Life2 = humanity as a whole (meaning of human existence)

Life3 = all biological life (meaning of all living organisms collectively)

Life4 = all of space-time existence (meaning of it all)

Life5 = rough marker for those aspects of human life that have a kind of existential gravitas and are of immense concern and the subject of intense questioning by human beings (see Section 2.e. below)

Each of these options for understanding “life” in the traditional formulation tracks possible interpretations of the question. The targets of our questions and concerns about meaning are varied in scope. We ask questions about our own, personal existence as well as questions about the entire show, and one might think that questions about personal meaning are connected to questions about cosmic meaning. Life5 provides a way of bringing important aspects of each together (see Section 2.e. below).

c. The Definite Article

Another thorny issue for the traditional formulation is its incorporation of the definite article—the. It implies that there is only one meaning of life, which violates common inclinations that meaning is the sort of thing that varies from person to person. What makes one life meaningful is different from what makes another meaningful. One person might derive large doses of meaning from her career, another through gardening. For this reason, many are suspicious of the definite article.

There is good reason, though, to question this suspicion. First, it might reveal confusion about what meaning even is in the first place. Indeed, one of the aims of those working in the field is to clarify just what meaning is. Here, it is worth noting that many plausible theories of meaning have an objective component, indicating that not just anything goes for meaning. However, even if meaning were solely a matter of, say, being fulfilled, notice that the following two claims are still consistent: (1) the meaning of life is about being fulfilled and (2) sources of fulfillment are exceedingly diverse. Life’s meaning in this case is about being fulfilled (consistent across persons), but sources of fulfillment vary from person to person.

Second, one might also reasonably think that there is a single meaning of life at the cosmic level that itself is consistent with a rich variety of ways to lead a meaningful life (meaning in life at the terrestrial, personal level). Thinking through possibilities like this will connect with claims about what is true about the world, for example, whether there is a God with a plan for the cosmos and whether there is an overarching meaning to it all. In a case like this, there might be a single meaning of life, but the sense of meaning in which there is a single meaning could be different from the sense of meaning in which there are varied meanings. Regardless of the complexities here, the point is that one should not too quickly dismiss the definite article as contributing to intractable theoretical and practical problems for thinking about life’s meaning.

d. Meaning of Life vs. Meaning in Life

In what has become a standard distinction in the field, philosophers distinguish two ideas: the meaning of life (MofL) and meaning in life (MinL). Claims like the following are prevalent: “One can find meaning in her life, even if there is no grand, cosmic meaning of life.” MofL is more global or cosmic in scope, and often is intertwined with ideas like God, transcendence, religion, or a spiritual, sacred realm. In asking for life’s meaning, one is often asking for some sort of cosmic meaning, though she may also be asking for the meaning of her individual life from the perspective of the cosmos since many think the meaning of their individual lives is tied to whether or not there is a meaning of it all.

MinL is focused on personal meaning: the meaning of our individual lives as located in the web of human endeavors and relationships sub specie humanitatis, that is, within the frame of human cares and concerns. Many think that we can legitimately talk about life having meaning in this sense regardless of what is true about the meaning of the universe as a whole.

One can see how the various senses of meaning discussed earlier in this entry intersect at both levels, MofL and MinL. For example, if sense-making is in view at the cosmic level, we might ask questions like the following: “What’s it all about?” or “How does it all fit together?” At the terrestrial, personal level, our sense-making questions might, rather, take the following shape: “What is my life about?” “How does my life fit together?” or “Is my life coherent?” If significance is in view at the cosmic level, we might ask, “Do our lives really matter in the grand scheme of things?” whereas terrestrially, personally, we might ask, “Does my life matter to me, my family, friends, or my community?”

e. What is the Meaning of x?

The locution, “What is the meaning of x?” need not be understood narrowly as the request for something semantic, say, for a definition or description. There are additional non-linguistic contexts in which this request makes perfect sense (see Nozick 1981). Some of them even share striking similarities to the question of life’s meaning. One in particular is especially relevant.

Sometimes we are confronted with circumstances that we do not yet sufficiently understand, in which case we might naturally respond by asking, “What’s this all about?” or “What’s going on here?” or “What happened?” or “What’s happening?” or “What does this mean?” or “What is the meaning of this?” In asking such questions, we are in search of sense-making and intelligibility. We walk in on our children fighting and demand: “What is the meaning of this?” Mary Magdalene and Mary the mother of James come to find a stone rolled away from a Roman guarded tomb. The burial linens are there, but Jesus’ body is nowhere to be found. One can imagine them thinking, “What is the meaning of this?”

We naturally invoke the formula “What is the meaning of x?” in situations where x is some fact, event, phenomenon, or cluster of such things, and about which we want to know, in the words of New Testament scholar and theologian, N. T. Wright, its “implication in the wider world within which this notion makes the sense it makes” (Wright 2003: 719). Such requests track our desire to make sense of a situation, to render it intelligible with the further aim of acting appropriately in response: a kind of epistemic map to aid in practical, normative navigation.

Taking our cue from these ordinary examples, to inquire about life’s meaning is plausibly understood as asking something similar to our requests for the meaning of our children’s scuffle or of Jesus’ empty tomb. Over the course of our existence, we encounter aspects of the world that have a kind of existential gravitas in virtue of their role in defining and depicting the human condition. They capture our attention in a unique way. The word “life,” then, is a rough marker for these existentially-weighty aspects (Life5 in Section 2.b. above), aspects of life that give rise to profound questions for which we seek an explanatory framework (perhaps even a narrative framework) in order to make sense of them. These aspects of the world are akin to the portion of the scuffle and empty tomb above to which we already have limited informational access: yelling and throwing in the case of the scuffle, and the various pieces and clues observed at the empty tomb. Like the parent or Mary Magdalene in those situations, we lack important parts of life’s context, and we desire to fill in these existentially relevant gaps in our knowledge, and then live accordingly. We are in search of life’s meaning, where that meaning is, at center, a kind of overarching sense-making framework for answering and fitting together answers to our questions about origins, purpose, significance, value, suffering, and destiny.

f. Interpretive Strategies

i. The Amalgam Approach

The currently favored strategy for interpreting the traditional formulation of the question (What is the meaning of life?) is the amalgam approach. On this pluralist view, the question is not thought to be a single question at all, but rather an amalgam of numerous other questions, most of which share family resemblances. The question is, on this view, simply a place-holder (some think ill-conceived) for these other questions and is, itself, not capable of being answered in this form. Though it has no answer in this form, other questions about purpose, significance, value, worth, origins, and destiny might. We at least know what we ask when we ask them, so the thought goes. Suspicion of the traditional formulation often accompanies the amalgam view since that formulation makes use of the definite article (“the”), the word “meaning,” and the word “life,” which together, in the grammatical form in which they are found, contribute to a thorny interpretive challenge. Perhaps the best strategy, according to many proponents of the amalgam interpretation, is simply to jettison the traditional formulation and focus on trying to answer some among this other cluster of questions that collectively embody what we are concerned about when we inquire into life’s meaning.

ii. The Single Question Approach

Though the amalgam interpretation is the most popular view among those writing on life’s meaning within analytic philosophy, a few others have favored an approach that views the traditional formulation as a single question capable of being answered in that form (see Seachris 2009, 2019; Thomson 2003). A promising strategy here is to prioritize the sense-making connotation of meaning. On this version of the interpretive approach, asking about the meaning of life is first about seeking a sense-making explanation (perhaps even narrative explanation) for our questions and concerns about origins, purpose, significance, value, suffering, and destiny. Contrary to the amalgam interpretation, on this view, the question of life’s meaning is asking for a single thing: a sense-making explanation. It is, of course, an explanation squarely focused on all this other meaning of life “stuff.” This explanation can be thought of as a worldview or metanarrative. This approach is an organic interpretive strategy that seeks a single answer (e.g., narrative explanation) that unifies or integrates answers to all the sub-questions that define and depict the human condition. It provides the conceptual resources to account for both MofL and MinL. The cosmic and the personal, the epistemic and the normative, and the theoretical and the practical are inseparable in our search for meaning. The sense-making framework that we seek links all of this as we pursue meaningful lives in light of our place within the grand scheme of it all.

This version of the single-question approach, with its emphasis on sense-making, is closely related to the concept of worldview. Worldviews provide answers to the existentially weighty set of questions that brings into relief the human condition. As philosopher Milton Munitz notes:

. . . [people] may say that what they are looking for [when asking the question of life’s meaning] is an account of the “big picture” with whose aid they would be able to see not only their own individual personal lives, but the lives of everybody else, indeed of everything of a finite or limited sort, human or not. . . . The expression of such a concern involves, at bottom, the appeal to a “worldview” or “world picture.” This undertakes to give a description of the most inclusive setting within which human life is situated . . . (Munitz 1993: 30).

To offer a worldview, then, is to offer a putative meaning of life—a sense-making framework focused squarely on the set of questions and concerns surrounding origins, purpose, significance, value, suffering, and destiny.

Looking back further into the origin of the worldview concept strengthens the connection between worldview and life’s meaning, and offers important clues that a worldview provides a kind of sense-making meaning. Nineteenth-century German historian and philosopher Wilhelm Dilthey spoke of a worldview as a concept that “. . . constitutes an overall perspective on life that sums up what we know about the world, how we evaluate it emotionally, and how we respond to it volitionally.” Worldviews possess three distinct yet interrelated dimensions: cognitive, affective, and practical. They address both MofL and MinL. A worldview is motivated out of a desire to answer what he calls the “riddle of existence”:

The riddle of existence faces all ages of mankind with the same mysterious countenance; we catch sight of its features, but we must guess at the soul behind it. This riddle is always bound up organically with that of the world itself and with the question what I am supposed to do in this world, why I am in it, and how my life in it will end. Where did I come from? Why do I exist? What will become of me? This is the most general question of all questions and the one that most concerns me (Dilthey 1980: 81-82).

The questions that, for Dilthey, motivate worldview construction are those same questions to which we want answers in seeking life’s meaning. In this way, life’s meaning might just be a sense-making framework. It is not a stretch to say that life’s meaning is that which worldviews aim to provide.

3. Theories of Meaning in Life

Beyond important preliminary discussions over the nature of the question itself and its constituent parts, one will find competing theories of meaning in life. Here, the debate is over the question of what makes a person’s life meaningful, not over the question of whether there is a cosmic meaning of it all (though, again, some think the two cannot be so easily disentangled). The four most influential views of meaning in life are: (1) Supernaturalism, (2) Objective Naturalism, (3) Subjective Naturalism, and (4) Hybrid Naturalism. (5) Nihilism is not a theory of meaning; rather, it is the denial of meaning, whether cosmic or personal. Objective, subjective, and hybrid naturalism are all optimistic forms of naturalism. They allow for the possibility of a meaningful existence in a world devoid of finite and infinite spiritual realities. Pessimistic naturalism, or what is commonly called “nihilism,” is generally, though not always, thought to be an implication of an entirely naturalistic ontology, though vigorous debate exists about whether naturalism entails nihilism.

a. Supernaturalism

Roughly, supernaturalism maintains that God’s existence, along with “appropriately relating” to God, is necessary and sufficient for securing a meaningful life, although accounts diverge on the specifics. Among countless others, historic representatives of supernaturalism in the Near-Eastern ancient world and in subsequent history include Qoheleth (the one called “Teacher” in the Old Testament book of Ecclesiastes), Jesus, the Apostle Paul, Augustine, Aquinas, Jonathan Edwards, Blaise Pascal, Leo Tolstoy, C. S. Lewis, and many contemporary analytic philosophers.

Meaningful life, on supernaturalism, consists of claims along metaphysical, epistemological, and relational-axiological axes. Metaphysically, meaningful life requires God’s existence because, for example, conditions that ground properties necessary for meaning like objective value are thought to be most plausibly anchored in a being like God (see Cottingham 2005; Craig 2008). It also requires, at some level, orthodoxy (right belief) and orthopraxy (right life and practice), though again, much debate exists on the details. In addition to God’s existence, meaning in life requires that a person be appropriately related to God, perhaps as expressed in one’s beliefs and especially in one’s devotion, worship, and the quality of her life lived with and among others as, for example, embodied in Jesus’ statement of the greatest commandments (cf. Matt. 22:34-40).

Pascal captures the spirit of supernaturalism in this passage from the Pensées:

What else does this craving, and this helplessness, proclaim but that there was once in man a true happiness, of which all that now remains is the empty print and trace? This he tries in vain to fill with everything around him, seeking in things that are not there the help he cannot find in those that are, though none can help, since this infinite abyss can be filled only with an infinite and immutable object; in other words by God himself (Pascal 1995: 45).

As does St. Augustine at the beginning of his Confessions:

. . . you have made us for yourself, and our heart is restless until it rests in you (St. Augustine 1963: 17).

It is worth noting that there are versions of supernaturalism that do not view God as necessary for meaningful life, but nonetheless claim that God and relating to God in appropriate ways would significantly enhance meaning in life. This more moderate form of supernaturalism allows for the possibility of meaningful life, in some measure, on naturalism (see Metz 2019 for a helpful taxonomy of the conceptual space here).

Supernaturalist views, whether stronger or more moderate, connect with questions and concerns about the problem of evil, post-mortem survival, and ultimate justice. It is often thought that a being like God is needed to “author and direct” the narrative of the universe, and, in some sense, the narratives of our individual lives to a good and blessed ending (involving both closure and teleological senses of ending, though not an absolute termination sense; see Seachris 2011). Many worry that, on naturalism, life does not make sense or is absurd (a kind of sense-making meaning; see Section 2.a.i. above) if there is no ultimate justice and redemption for the ills of this world, and if the last word is death and dissolution, followed by silence, forever.

b. Subjective Naturalism

Subjective naturalism is an optimistic naturalistic view in claiming that life can be robustly meaningful even if there is no God, after-life, or transcendent realm. In this, it is like objective and hybrid forms of naturalism. According to subjective naturalism, what constitutes a meaningful life varies from person to person, and is a function of getting what one strongly wants, achieving self-established goals, or accomplishing what one believes to be really important. Caring about or loving something deeply has been thought by some to confer meaning in life (see Frankfurt 1988). Some subjectivist views focus on affective states of a certain psychological profile, like fulfillment or satisfaction for example, as constituting the essence of meaningful life (see Taylor 1967). Subjectivism is appealing to some in light of perceived failures to ground objective value, either naturally, non-naturally, or supernaturally, and in accounting for the widespread view that meaning and fulfillment are closely connected.

A worry for subjective naturalism, analogous to ethical worries about moral relativism, is that this view is too permissive, allowing for bizarre or even immoral activities to ground meaning in life. Many protest that surely deep care and love, by themselves, are not sufficient to confer meaningfulness in life. What if someone claims to find meaning by measuring and re-measuring blades of grass or memorizing the entire catalogue of Netflix shows or, worse, torturing people for fun? Can a life centering on such pursuits be meaningful? A strong, widespread intuition here inclines many towards requiring a condition of objective value or worth on meaning. Subjectivism still has thoughtful defenders, though, with some proposals moving towards grounding value inter-subjectively—in community and its shared values—as opposed to in the individual exclusively. It is also worth noting that one could be a subjectivist about meaning while being an objectivist about morality. In this way, a fulfilled torturer might lead a meaningful, though immoral life. Meaning and morality, on this view, are distinct values that can, in principle, come into conflict.

c. Objective Naturalism

Objective naturalism, like subjective naturalism, posits that a meaningful life is possible in a purely physical world devoid of finite and infinite spiritual realities. It differs, though, in what is required for meaning in life. Objective naturalists claim that a meaningful life is a function of appropriately connecting with mind-independent realities that are of objective worth (contra subjectivism) and that are entirely natural (contra supernaturalism). Theories differ on the nature of this connection. Some require mere orientation around objective value, while others require a stronger causal connection with good outcomes (see Smuts 2013). Again, objective naturalism is distinguished from subjective naturalism by its emphasis on mind-independent, objective value. One way of putting the point is to say that wanting or choosing is insufficient for a meaningful life. For example, choosing to spend one’s waking hours memorizing the inventory of one’s local Target store, even if this activity results in fulfillment, is likely insufficient for meaning on objective naturalism. Rather, meaning is a function of linking one’s life to objectively valuable, mind-independent conditions that are not themselves the sole products of what one wants and chooses. On objective naturalism it is possible to be wrong about what confers meaning on life: something is meaningful, at least partly, in virtue of its intrinsic nature, irrespective of what is believed about it. This is why spending salient portions of one’s life memorizing department store inventories is not meaningful on objective naturalism, even if the person strongly desires to do this.

One worry for objective naturalism is that it may have a harder time accounting for cases of neural atypicality, for example, a person with autism spectrum disorder (ASD) who is deeply fulfilled by activities that seem to lack intrinsic value or worth. Does a person who is not a plumber and for whom pipes and interactions with pipes provide salient goals, a kind of coherence to his life, and enjoyable experiences fail to acquire meaning because it all largely revolves around a fascination with pipes? Might subjectivist views better account for the lives of those among us whose interests and interactions with the world are strikingly different, and for whom such interests are the result of neural atypicality?

Critics of objective naturalism might also press the point that proponents of this view conflate meaning and morality or at least conflate important aspects of these two putatively different kinds of value. One value might be objectively shaped, whereas the other might not.

d. Hybrid Naturalism

Many researchers think that there is something right about both objectivist and subjectivist views, but that each on its own is incomplete. Susan Wolf has developed what has come to be one of the more influential theories of meaning in life over the last decade or so, the fitting-fulfillment view. Her view includes both objective and subjective conditions, and is captured by the slogan, “Meaning arises when subjective attraction meets objective attractiveness” (Wolf 1997: 211). Meaning is not present in a life spent believing in, being fulfilled by, or caring about worthless projects, but neither is it present in a life spent engaging in worthwhile, objectively valuable projects without also believing in, being fulfilled by, or caring about them. Many think hybridist views capture what is best about objectivism and subjectivism while avoiding the pitfalls of each.

In their naturalistic forms, such theories of meaning are inconsistent with supernaturalism. However, one can imagine supernaturalist forms of each of these views. One might be a supernaturalist who thinks that meaning wholly or largely consists in subjective fulfillment in the Divine—a kind of subjectivism, or that meaning consists in orientation around objective value, again grounded in the Divine—a kind of objectivism. One could also formulate distinctly supernaturalist hybrid views.

e. Pessimistic Naturalism: Nihilism

In opposition to all optimistic views about the possibility of meaningful life is pessimistic naturalism, more commonly called nihilism. Roughly, nihilism is the view that denies that a meaningful life is possible because, literally, nothing has any value. Nihilism may be understood as a combination of theses and assumptions drawn from both supernaturalism and naturalism: (i) God or some supernatural realm is likely necessary for value and a meaningful life, but (ii) no such entity or realm exists, and therefore (iii) nothing is ultimately of value and there is, therefore, no meaning. Other forms of nihilism focus on states like boredom or dissatisfaction, arguing that boredom sufficiently characterizes life so as to make it meaningless, or that human lives lack the requisite amount of satisfaction to confer meaning upon them.

f. Structural Contours of Meaning in Life

If meaning is a distinct kind of value that a life can have, and if the three senses of meaning above (see Section 2.a. above) capture the range of ideas encompassed by meaning, then these ideas can help illumine the conceptual shape of meaning in life. Each of the ordinary senses of “meaning” provides strategies for conceptualizing the broad structural contours of meaningful life.

Sense-making: An intelligible life; one that makes sense (broad sense-making), that fits together properly, and exhibits a kind of coherence (for example, relationally, vocationally, morally, spiritually, and so on), perhaps even narrative coherence.

Purpose: A life saliently oriented around purposes, goals, and aims, and lived on purpose in which the person’s autonomy is sufficiently engaged.

Significance: A life that matters (and has positive value)—intrinsically in virtue of the kind of life that it is and extrinsically in virtue of its implications and impacts, especially within the narrow (e.g., familial) and broad (e.g., cultural) relational webs of which the person is a part.

Though one can view these as largely different ways of thinking about what a meaningful life is, one might think that there is a more organic relationship between them. Here is one strategy through which all three senses of meaning might coalesce and bring into relief the full structural contours of meaningful life in a unified way:

Meaningful Life = A life that makes sense, that fits together properly (sense-making) in virtue of appropriate orientation around goals (purpose), other (atelic) activities (see Setiya 2017), and relationships that matter and have positive value (significance).

Philosophers may want to follow social scientists here in thinking more about this tripartite conception of meaning. Psychologists, for example, are increasingly using similar accounts in experimental design and testing. One prominent psychologist working in the area of meaning proposes a definition of meaning in life that incorporates a similar triad that prioritizes sense-making:

Meaning is the web of connections, understandings, and interpretations that help us comprehend our experience and formulate plans directing our energies to the achievement of our desired future. Meaning provides us with the sense that our lives matter, that they make sense, and that they are more than the sum of our seconds, days, and years (Steger 2012: 165).

4. Death, Futility, and a Meaningful Life

Life’s meaning is closely linked with a cluster of related issues including death, futility, and endings in general. These are important themes in the literature on meaning, and are found in a wide array of sources ranging from the Old Testament book of Ecclesiastes to Tolstoy to Camus to contemporary analytic writing on the topic. Worries that death, as conceived on naturalism, threatens meaning lead into discussions about futility. It is a commonly held view that life is futile if all we are and do eventually comes to nothing. If naturalism is true and death is the end . . . period . . . then life is futile, so the argument goes. Left undeveloped, it is not entirely clear what people mean by this, though the sentiment behind the idea is intense and prevalent.

In order to explore the worry further, it is important to get clearer on what is meant by futility. In ordinary cases, something is futile when the accomplishment or fulfillment of what is aimed at or desired is impossible. Examples of futility include:

It is futile for a human being to try to both exist and not exist at the same time and in the same sense.

It is futile to try to jump to Mars.

It is futile to try to write an entire, 300-page novel, from start to finish, in one hour.

On the preceding account of futility, the existential angst that accompanies some instance of futility is proportional to how one feels about what it is that is futile. The extent to which one is invested—for example, emotionally and relationally—in attempting to reach some desired end will affect how she responds to real or perceived futility (“perceived” because one could be wrong about whether or not something is, in fact, futile). Imagine that a person has a curiosity to experience flying as a falcon flies. It would be futile to attempt to fly as a falcon flies. Though this person might be minimally distressed as a result of not being able to experience this, it is doubtful he would experience soul-crushing angst. Contrast this with a situation where one has trained for years to run an ironman triathlon, but one week prior to the event, she is paralyzed from the neck down in a tragic automobile accident. To now try and compete in the triathlon without mechanical assistance would be futile. Given the importance of this goal in the person’s life, she would appropriately feel significant existential angst at not being able to compete. Years of training would be unrewarded. Deep hopes would be dashed. A central life goal is now forever unfulfilled. The level of existential angst accompanying futility, then, is proportional to the level of one’s investment in some desired end and the relative desirability of that end.

The preceding analysis is relevant to futility and life’s meaning. What might people have in mind when they say that life itself is futile if naturalism is true and death is the last word of our lives and the universe? The discrepancy here from which a sense of futility emerges is between central longings of the human heart and a world devoid of God and an afterlife, which is a world incapable of fulfilling such longings. There is a stark incongruity between what we really want (even what we might say we need) and a completely and utterly silent universe that does not care. There is also a discrepancy between the final state of affairs where quite literally nothing matters, and the current state of affairs where many things seem to matter (e.g., relationships, personal and cultural achievements, and scientific advancements, among others). It seems hard to fathom that things with such existential gravitas are but a vapor in the grand scheme of things. We might also call this absurd, since absurdity and futility are connected, both of which are partly encapsulated in the idea of a profound incongruity or lack of fit.

Futility, in this way, connects to hope and expectations about fulfillment and longevity. In some circumstances, we are inclined to think that something is characterized by futility if it does not last as long as we think it should last given the kind of thing that it is. If you spend half a day building a snow fort and your children destroy it in five minutes, you will be inclined to think that your efforts were futile even though you accomplished your goal of building the fort. You will not, however, think your efforts were futile if the fort lasts a few days and provides you and your children with several fun adventures and a classic snowball fight. It needs to last long enough to serve its purpose.

Some say that an average human lifetime with average human experiences is sufficient to satiate core human longings and for us to accomplish central purposes (see Trisel 2004). Others, however, think that only eternity is long enough to do justice to those aspects of the human condition of superlative value, primarily and especially, happiness and love, the latter understood roughly as commitment to the true good or well-being of another. Some things are of such sublime character that for them to be extinguished, even after eons upon eons, is truly tragic, so the thinking goes. Anything less than forever is less than enough time, and leads to a sense of futility. We want the most important things in life—especially happiness, love and relationships—to last indefinitely. But if naturalism is true, all will be dissolved in the death of ourselves and the universe; it will be as if none of this ever happened. If the important stuff of life that we are so invested in lasts only a short while, many worry that life itself is deeply and ultimately futile.

Futility, then, is sometimes linked with how something ends. With life’s meaning in view, many worry that its meaning is jeopardized if, in the end, all comes indelibly to naught. Such worries have been articulated in what some call Final Outcome Arguments (see Wielenberg 2006). A final outcome argument is one whose conclusion is that life is somewhat or wholly meaningless or absurd or futile because of a “bad” ending. Such arguments can have weaker and stronger conclusions, ranging from a “bad” ending only slightly mitigating meaning all the way to completely destroying meaning. What they all have in common, however, is that they give the ending an important say in evaluating life’s meaning.

Why think that endings have such power? Many have argued that giving them this power arbitrarily privileges the future over the past. Thomas Nagel once said that “. . . it does not matter now that in a million years nothing we do now will matter” (Nagel 1971: 716). Why should we think the future is more important than, or relevant at all to, the past and the present? But perhaps Nagel is mistaken. There may, in fact, be good reasons to think that how life ends is relevant for evaluating its meaning (see Seachris 2011). Whichever conclusion one adopts, principled reasons must be offered to settle the question of which viewpoint, the distant future or the immediate present, takes priority in appraisals of life’s meaning.

5. Underinvestigated Areas

Within value theory, an under-investigated area is how meaning fits within the overall normative landscape. How is it connected, if at all, with ethical, aesthetic, and eudaimonistic value, for example? What sorts of relationships, conceptual, causal or otherwise, exist between the various values? Do some reduce to others? Can profoundly unethical lives still count as meaningful? What about profoundly unhappy lives? These and other questions are on the table as a growing number of researchers investigate them.

Another area in need of increased attention is the relationship between meaning and suffering. Suffering intersects with our attempts to make sense of our lives in this universe, motivates our questions about why we are here, and gives rise to our concerns about whether or not we ultimately matter. We wonder if there is an intelligible, existentially satisfying narrative in which to locate, and so make sense of, our visceral experience of suffering, and to give us solace and hope. Evil in a meaningful universe does not cease to be evil, but it can be more bearable within these hospitable conditions. Perhaps the problem of meaning is more fundamental than the problem of evil. Also relevant is what can be called the eschatological dimension of the problem of evil: is there any hope in the face of pain, suffering, and death, and if so, in what does this hope consist? Addressing future-oriented considerations of suffering will naturally link to perennial meaning-of-life topics like death and futility. Additionally, it will motivate further discussion over whether the inherent human desire for a felicitous ending to life’s narrative (including, for example, post-mortem survival and enjoyment of the beatific vision or some other blessed state) is mere wishful thinking or a cousin to our desire for water and thus a truly natural desire that points to an object capable of fulfilling it.

Equally under-investigated is how the concept of narrative (and meta-narrative) might shed light on the meaning of life, and especially what talk of life’s meaning is often all about. Historically, most of the satisfying narratives that in some way narrated the meaning of life were also religious or quasi-religious. Additionally, many of these narratives count as narratives in the paradigmatic sense as opposed to non-narrative modes of discourse. However, with the rise of naturalism in the West, these narratives and the religious or quasi-religious worldviews embedded within them, began to lose traction in certain sectors. Out of this milieu emerged more angst-laden questioning of life’s meaning accompanied by the fear that a naturalistic meta-narrative of the universe fails to be existentially satisfying. More work is needed by cognitive scientists, theologians, and philosophers on our narrative proclivities as human beings, and how these proclivities shape and illumine our pursuit of meaning.

Finally, a number of pressing practical and ethical questions, especially focusing on marginalized populations, deserve more careful attention. For example, how might the actual lives and experiences of persons with disabilities inform and constrain theories of meaning in life? Do their lives call into question certain theories of meaning? What does the practice of solitary confinement reveal about the human need for meaning? Does the profound lack of meaning in such circumstances provide a reason to impose stricter limitations on its use? How might the human need for meaning (see Bettelheim 1978; Frankl 2006) be leveraged to understand and then address systemic societal issues like homelessness and opioid addiction? How can understanding seemingly pathological expressions of our yearning for meaning help make sense of and respond to nationalism and terrorism?

Analytic philosophy, once deeply skeptical of and indifferent to the meaning of life, is now the source of important and interesting new theorization on the topic. There is even something of a subfield emerging, consisting of researchers devoting significant time and energy to understanding conceptual and practical aspects of life’s meaning. The topic is being approached with an analytic rigor that is leading to progress and opening exciting avenues for promising new breakthroughs. The philosophical waters, though still murky, are clearing.

6. References and Further Reading

  • Adams, E. M. “The Meaning of Life.” International Journal for Philosophy of Religion 51 (April 2002): 71-81.
  • Antony, Louise M., ed. Philosophers Without Gods: Meditations on Atheism and the Secular Life. Oxford: Oxford University Press, 2007.
  • Audi, Robert. “Intrinsic Value and Meaningful Life.” Philosophical Papers 34 (2005): 331-55.
  • Augustine. The Confessions of St. Augustine. Trans. by Rex Warner. New York: Mentor, 1963.
  • Baggini, Julian. What’s It All About? Philosophy & the Meaning Of Life. Oxford: Oxford University Press, 2004.
  • Baumeister, Roy F. Meanings of Life. New York: The Guilford Press, 1991.
  • Baumeister, Roy F., Kathleen D. Vohs, Jennifer Aaker, and Emily N. Garbinsky. “Some Key Differences between a Happy Life and a Meaningful Life.” Journal of Positive Psychology 8:6 (2013): 505-516.
  • Benatar, David. Better Never to Have Been: The Harm of Coming into Existence. Oxford: Oxford University Press, 2009.
  • Benatar, David. The Human Predicament: A Candid Guide to Life’s Biggest Questions. New York: Oxford University Press, 2017.
  • Benatar, David, ed. Life, Death & Meaning: Key Philosophical Readings on the Big Questions. Lanham, MD: Rowman & Littlefield Publishers, 2004.
  • Berger, Peter. The Sacred Canopy. New York: Doubleday, 1967.
  • Bernstein, J. M. “Grand Narratives.” in Paul Ricoeur: Narrative and Interpretation, ed. David Wood, 102-23. London: Routledge, 1991.
  • Bettelheim, Bruno. The Uses of Enchantment. New York: Knopf, 1978.
  • Bielskis, Andrius. Existence, Meaning, Excellence: Aristotelian Reflections on the Meaning of Life. London: Routledge, 2017.
  • Blessing, Kimberly A. “Atheism and the Meaningfulness of Life.” in The Oxford Handbook of Atheism. New York: Oxford University Press, 2013: 104-118.
  • Bortolotti, Lisa, ed. Philosophy and Happiness. Hampshire, UK: Palgrave Macmillan, 2009.
  • Bradley, Ben. “Existential Terror.” Journal of Ethics 19 (2015): 409-18.
  • Britton, Karl. Philosophy and the Meaning of Life. Cambridge: Cambridge University Press, 1969.
  • Calhoun, Cheshire. “Geographies of Meaningful Living.” Journal of Applied Philosophy 32:1 (2015): 15-34.
  • Campbell, Stephen M., and Sven Nyholm. “Anti-Meaning and Why It Matters.” Journal of the American Philosophical Association 1:4 (Winter 2015): 694-711.
  • Camus, Albert. The Myth of Sisyphus and Other Essays. Translated by Justin O’Brien. New York: Vintage International, 1983.
  • Chappell, Timothy. “Infinity Goes Up On Trial: Must Immortality Be Meaningless?” European Journal of Philosophy 17 (March 2009): 30-44.
  • Cottingham, John. On the Meaning of Life. London: Routledge, 2003.
  • Cottingham, John. The Spiritual Dimension: Religion, Philosophy and Human Value. Cambridge: Cambridge University Press, 2005.
  • Craig, William Lane. “The Absurdity of Life Without God.” in Reasonable Faith: Christian Truth and Apologetics, 3rd Ed., 65-90. Wheaton, IL: Crossway Books, 2008.
  • Crane, Tim. The Meaning of Belief: Religion from an Atheist’s Point of View. Cambridge, MA: Harvard University Press, 2017.
  • Davis, William H. “The Meaning of Life.” Metaphilosophy 18 (July/October 1987): 288-305.
  • Dilthey, Wilhelm. Gesammelte Schriften, 8:208-9, quoted by Theodore Plantinga, Historical Understanding in the Thought of Wilhelm Dilthey. Toronto: University of Toronto Press, 1980.
  • Eagleton, Terry. The Meaning of Life. Oxford: Oxford University Press, 2007.
  • The Book of Ecclesiastes.
  • Edwards, Paul. “Life, Meaning and Value of.” in The Encyclopedia of Philosophy, Vol. 4, ed. Paul Edwards, 467-477. New York: Macmillan Publishing Company, 1967.
  • Edwards, Paul. “Why.” in The Encyclopedia of Philosophy, Vols. 7 & 8, ed. Paul Edwards, 296-302. New York: Macmillan Publishing Company, 1972.
  • Flew, Antony. “Tolstoi and the Meaning of Life.” Ethics 73 (January 1963): 110-18.
  • Fischer, John Martin. “Free Will, Death, and Immortality: The Role of Narrative.” Philosophical Papers 34 (November 2005): 379-403.
  • Fischer, John Martin. “Recent Work on Death and the Meaning of Life.” Philosophical Books 34 (April 1993): 65-74.
  • Fischer, John Martin. “Why Immortality is Not So Bad.” International Journal of Philosophical Studies 2 (September 1994): 257-70.
  • Flanagan, Owen. The Really Hard Problem: Meaning in a Material World. Cambridge, MA: MIT, 2007.
  • Ford, David. The Search for Meaning: A Short History. Berkeley, CA: University of California Press, 2007.
  • Frankfurt, Harry. The Importance of What We Care About. New York: Cambridge University Press, 1988.
  • Frankl, Viktor. Man’s Search for Meaning. Boston: Beacon Press, 2006.
  • Friend, David, and the Editors of LIFE. More Reflections on The Meaning of Life. Boston: Little Brown and Company, 1992.
  • Froese, Paul. On Purpose: How We Create the Meaning of Life. Oxford: Oxford University Press, 2016.
  • Gillespie, Ryan. “Cosmic Meaning, Awe, and Absurdity in the Secular Age: A Critique of Religious Non-Theism.” Harvard Theological Review 111:4 (2018): 461-487.
  • Goetz, Stewart. The Purpose of Life: A Theistic Perspective. London: Continuum, 2012.
  • Goetz, Stewart, and Joshua W. Seachris. What is This Thing Called The Meaning of Life? New York: Routledge, 2020.
  • Goldman, Alan H. Life’s Values: Pleasure, Happiness, Well-Being, & Meaning. Oxford: Oxford University Press, 2018.
  • Goodenough, Ursula W. “The Religious Dimensions of the Biological Narrative.” Zygon 29 (December 1994): 603-18.
  • Gordon, Jeffrey. “Is the Existence of God Relevant to the Meaning of Life?” Modern Schoolman 60 (May 1983): 227-46.
  • Gordon, Jeffrey. “Nagel or Camus on the Absurd?” Philosophy and Phenomenological Research 45 (September 1984): 15-28.
  • Haldane, John. Seeking Meaning and Making Sense. Exeter, UK: Imprint Academic, 2008.
  • Hamilton, Christopher. Living Philosophy: Reflections on Life, Meaning and Morality. Edinburgh: Edinburgh University Press, 2009.
  • Haught, John F. Is Nature Enough? Meaning and Truth in the Age of Science. Cambridge: Cambridge University Press, 2006.
  • Hepburn, R. W. “Questions about the Meaning of Life.” Religious Studies 1 (April 1966): 125-40.
  • Himmelmann, Beatrix, ed. On Meaning in Life. Boston: De Gruyter, 2013.
  • Holland, Alan. “Darwin and the Meaning of Life.” Environmental Values 18:4 (2009): 503-518.
  • Holley, David M. Meaning and Mystery: What it Means to Believe in God. Malden, MA: Wiley-Blackwell, 2010.
  • Kahane, Guy. “Our Cosmic Insignificance,” Noûs (2013): 1-28.
  • Karlsson, Niklas, George Loewenstein, and Jane McCafferty. “The Economics of Meaning.” Nordic Journal of Political Economy 30:1: 61-75.
  • Kauppinen, Antti. “Meaningfulness and Time.” Philosophy and Phenomenological Research 84 (2012): 345-77.
  • Kekes, John. “The Meaning of Life.” Midwest Studies in Philosophy 24 (2000): 17-34.
  • Kekes, John. The Human Condition. New York: Oxford University Press, 2010.
  • King, Laura A., Samantha J. Heintzelman, and Sarah J. Ward, “Beyond the Search for Meaning: A Contemporary Science of the Experience of Meaning in Life,” Current Directions in Psychological Science, 25:4 (2016): 211-216.
  • Klemke, E. D., and Steven M. Cahn, eds. The Meaning of Life. 4th edn. New York: Oxford University Press, 2017.
  • Kraay, Klaas J. Does God Matter? Essays on the Axiological Consequences of Theism. New York: Routledge, 2018.
  • Lacey, Alan. “The Meaning of Life,” in The Oxford Companion to Philosophy, 2nd ed., ed. Ted Honderich. New York: Oxford University Press, 2005.
  • Landau, Iddo. Finding Meaning in an Imperfect World. New York: Oxford University Press, 2017.
  • Landau, Iddo. “Life, Meaning of” in The International Encyclopedia of Ethics. Wiley-Blackwell, 2013: 3043-3047.
  • Landau, Iddo. “The Meaning of Life Sub Specie Aeternitatis.” Australasian Journal of Philosophy 89:4 (2011): 727-734.
  • Law, Stephen. “The Meaning of Life.” Think 11 (2012): 25-38.
  • Leach, Stephen, and James Tartaglia, eds. The Meaning of Life and the Great Philosophers. London: Routledge, 2018.
  • Levine, Michael. “What Does Death Have to Do with the Meaning of Life?” Religious Studies 23 (1987): 457-65.
  • Levy, Neil. “Downshifting and Meaning in Life.” Ratio 18 (June 2005): 176-89.
  • Lewis, C. S. “De Futilitate.” in Christian Reflections. Grand Rapids, MI: William B. Eerdmans Publishing Company, 1995.
  • Lewis, C. S. “On Living in an Atomic Age,” in Present Concerns. San Diego: Harcourt, Inc., 1986.
  • Luper-Foy, Stephen. “The Absurdity of Life.” Philosophy and Phenomenological Research 52 (1992): 85-101.
  • Lurie, Yuval. Tracking the Meaning of Life: A Philosophical Journey. Columbia, MO: University of Missouri Press, 2006.
  • MacIntyre, Alasdair. After Virtue, 3rd Edn. Notre Dame, IN: University of Notre Dame Press, 2007.
  • Makkreel, Rudolf A. “Dilthey, Wilhelm,” in The Cambridge Dictionary of Philosophy, ed. Robert Audi. Cambridge: Cambridge University Press, 2001.
  • Markus, Arjan. “Assessing Views of Life: A Subjective Affair?” Religious Studies 39 (2003): 125-43.
  • Martela, Frank, and Michael F. Steger, “The Three Meanings of Meaning in Life: Distinguishing Coherence, Purpose, and Significance,” The Journal of Positive Psychology, 11:5 (2016): 531-45.
  • Martin, Michael. Atheism, Morality, and Meaning. Amherst, NY: Prometheus Books, 2002.
  • Mawson, Timothy. God and the Meanings of Life: What God Could and Couldn’t do to Make Our Lives More Meaningful. London: Bloomsbury, 2016.
  • Mawson, Timothy. “Recent Work on the Meaning of Life and Philosophy of Religion.” Philosophy Compass 8 (2013): 1138-1146.
  • Mawson, Timothy. “Sources of Dissatisfaction with Answers to the Question of the Meaning of Life.” European Journal for Philosophy of Religion 2 (Autumn 2010): 19-41.
  • May, Todd. A Significant Life: Human Meaning in a Silent Universe. Chicago: University of Chicago Press, 2016.
  • McDermott, John J. “Why Bother: Is Life Worth Living?” The Journal of Philosophy 88 (November 1991): 677-83.
  • McGrath, Alister E. Surprised by Meaning. Louisville, KY: Westminster John Knox, 2011.
  • Metz, Thaddeus. “The Concept of a Meaningful Life.” American Philosophical Quarterly 38 (April 2001): 137-53.
  • Metz, Thaddeus. “Could God’s Purpose be the Source of Life’s Meaning?” Religious Studies 36 (2000): 293-313.
  • Metz, Thaddeus. “God’s Purpose as Irrelevant to Life’s Meaning: Reply to Affolter.” Religious Studies 43 (December 2007): 457-64.
  • Metz, Thaddeus. God, Soul and the Meaning of Life (Elements in the Philosophy of Religion). Cambridge: Cambridge University Press, 2019.
  • Metz, Thaddeus. “The Immortality Requirement for Life’s Meaning.” Ratio 16 (June 2003): 161-77.
  • Metz, Thaddeus. Meaning in Life. Oxford: Oxford University Press, 2016.
  • Metz, Thaddeus. “The Meaning of Life,” The Stanford Encyclopedia of Philosophy (Summer 2007 Edition), Edward N. Zalta (ed.).
  • Metz, Thaddeus. “New Developments in the Meaning of Life.” Philosophy Compass 2 (2007): 196-217.
  • Metz, Thaddeus. “Recent Work on the Meaning of Life,” Ethics 112 (July 2002): 781-814.
  • Metz, Thaddeus. “Utilitarianism and the Meaning of Life.” Utilitas 15 (March 2003): 50-70.
  • Morris, Thomas V. Making Sense of It All: Pascal and the Meaning of Life. Grand Rapids: William B. Eerdmans Publishing Company, 2002.
  • Moser, Paul K. “Divine Hiddenness, Death, and Meaning,” in Philosophy of Religion: Classic and Contemporary Issues, ed. Paul Copan and Chad Meister, 215-27. Malden, MA: Blackwell Publishers, 2008.
  • Munitz, Milton K. Does Life Have A Meaning? Buffalo, NY: Prometheus Books, 1993.
  • Nagel, Thomas. “The Absurd.” The Journal of Philosophy 68 (1971): 716-27.
  • Nagel, Thomas. Secular Philosophy and the Religious Temperament: Essays 2002-2008. Oxford: Oxford University Press, 2010.
  • Nozick, Robert. “Philosophy and the Meaning of Life.” in Philosophical Explanations. Cambridge, MA: Belknap, 1981. 571-79; 585-600.
  • O’Brien, Wendell. “Meaning and Mattering.” The Southern Journal of Philosophy 34 (1996): 339-60.
  • Oliva, Mirela. “Hermeneutics and the Meaning of Life.” Epoché 22:2 (Spring 2018): 523-39.
  • Pascal, Blaise. Pensées. Translated by A. J. Krailsheimer. London: Penguin Books, 1995.
  • Perrett, Roy W. “Tolstoy, Death and the Meaning of Life.” Philosophy 60 (April 1985): 231-45.
  • Pritchard, Duncan. “Absurdity, Angst, and the Meaning of Life.” The Monist 93 (January 2010): 3-16.
  • Rosenberg, Alex. The Atheist’s Guide to Reality: Enjoying Life Without Illusions. New York: Norton, 2011.
  • Rosenberg, Alex, and Tamler Sommers. “Darwin’s Nihilistic Idea.” Biology and Philosophy 18 (2003): 653-68.
  • Ruse, Michael. A Meaning to Life. Oxford: Oxford University Press, 2019.
  • Ruse, Michael. On Purpose. Princeton: Princeton University Press, 2017.
  • Russell, Bertrand. “A Free Man’s Worship.” in Why I Am Not a Christian and Other Essays on Religion and Related Subjects. New York: Touchstone, 1957. 104-16.
  • Russell, L. J. “The Meaning of Life.” Philosophy 28 (January 1953): 30-40.
  • Sartre, Jean-Paul. Existentialism & Humanism. Translated by Philip Mairet. London: Methuen, 1973.
  • Sartre, Jean-Paul. Nausea. Translated by Lloyd Alexander. New York: New Directions, 1964.
  • Schopenhauer, Arthur. Essays and Aphorisms. Translated by R. J. Hollingdale. London: Penguin Books, 2004.
  • Seachris, Joshua. “Death, Futility, and the Proleptic Power of Narrative Ending.” Religious Studies 47:2 (June 2011): 141-63.
  • Seachris, Joshua. “From the Meaning Triad to Meaning Holism: Unifying Life’s Meaning.” Human Affairs 49:4 (2019).
  • Seachris, Joshua W. “The Meaning of Life as Narrative: A New Proposal for Interpreting Philosophy’s ‘Primary’ Question.” Philo 12 (Spring-Summer 2009): 5-23.
  • Seachris, Joshua W. “The Sub Specie Aeternitatis Perspective and Normative Evaluations of Life’s Meaningfulness: A Closer Look,” Ethical Theory and Moral Practice 16 (2013): 605-620.
  • Seachris, Joshua, ed. Exploring the Meaning of Life: An Anthology and Guide. Malden, MA: Blackwell, 2012.
  • Seachris, Joshua, and Stewart Goetz, eds. God and Meaning: New Essays. New York: Bloomsbury Academic, 2016.
  • Setiya, Kieran. Midlife: A Philosophical Guide. Princeton: Princeton University Press, 2017.
  • Sharpe, R. A. “In Praise of the Meaningless Life.” Philosophy Now 25 (Summer 1999): 15.
  • Sherry, Patrick. “A Neglected Argument for Immortality.” Religious Studies 19 (March 1983): 13-24.
  • Sigrist, Michael J. “Death and the Meaning of Life.” Philosophical Papers 44:1 (March 2015): 83-102.
  • Singer, Irving. The Creation of Value. Volume 1 of Meaning in Life. Baltimore: The Johns Hopkins University Press, 1996.
  • Smart, J. J. C. “Meaning and Purpose.” Philosophy Now 24 (Summer 1999): 16.
  • Smith, Michael. “Is That All There Is?” The Journal of Ethics 10 (January 2006): 75-106.
  • Smuts, Aaron. “The Good Cause Account of the Meaning of Life.” Southern Journal of Philosophy 51:4 (2013): 536-62.
  • Steger, Michael F. “Experiencing meaning in life: Optimal functioning at the nexus of well-being, psychopathology, and spirituality,” (pp. 165-184) in The Human Quest for Meaning, Ed. P. T. P. Wong. New York: Routledge, 2012.
  • Suckiel, Ellen Kappy. “William James on the Cognitivity of Feelings, Religious Pessimism, and the Meaning of Life.” The Journal of Speculative Philosophy 17 (2003): 30-39.
  • Svendsen, Lars. A Philosophy of Boredom. Trans. by John Irons. London: Reaktion Books, 2005.
  • Tartaglia, James. Philosophy in a Meaningless Life. London: Bloomsbury Academic, 2015.
  • Taylor, Richard. “Time and Life’s Meaning.” Review of Metaphysics 40 (1987): 675-86.
  • Taylor, Richard. “The Meaning of Life.” in Good and Evil. New York: Macmillan Publishing, 1967.
  • Thomas, Joshua Lewis. “Meaningfulness as Sensefulness.” Philosophia (2019): https://doi.org/10.1007/s11406-019-00063-x.
  • Thomson, Garrett. On the Meaning of Life. London: Wadsworth, 2003.
  • Tolstoy, Leo. “A Confession.” in Spiritual Writings. Maryknoll, NY: Orbis Books, 2006.
  • Trisel, Brooke Alan. “Futility and the Meaning of Life Debate.” Sorites 14 (2002): 70-84.
  • Trisel, Brooke Alan. “Human Extinction and the Value of Our Efforts.” The Philosophical Forum 35 (Fall 2004): 371-91.
  • Trisel, Brooke Alan. “Human Extinction, Narrative Ending, and Meaning of Life.” Journal of Philosophy of Life 6:1 (April 2016): 1-22.
  • Vernon, Mark. After Atheism: Science, Religion, and the Meaning of Life. New York: Palgrave Macmillan, 2008.
  • Waghorn, Nicholas. Nothingness and the Meaning of Life: Philosophical Approaches to Ultimate Meaning Through Nothing and Reflexivity. London: Bloomsbury, 2014.
  • White, Heath. “Mattering and Mechanism: Must a Mechanistic Universe be Depressing?” Ratio 24 (September 2011): 326-39.
  • Wielenberg, Erik J. Value and Virtue in a Godless Universe. Cambridge: Cambridge University Press, 2006.
  • Williams, Bernard. “The Makropulos Case: Reflections on the Tedium of Immortality.” in The Metaphysics of Death, ed. John Martin Fischer, 73-92. Stanford, CA: Stanford University Press, 1993.
  • Wisnewski, J. J. “Is the Immortal Life Worth Living?” International Journal for Philosophy of Religion 58 (2005): 27-36.
  • Wolf, Susan. “Happiness and Meaning: Two Aspects of the Good Life.” Social Philosophy and Policy 14 (December 1997): 207-25.
  • Wolf, Susan. Meaning in Life and Why It Matters. Princeton: Princeton University Press, 2010.
  • Wolf, Susan. “Meaningful Lives in a Meaningless World,” Quaestiones Infinitae 14 (June 1997): 1-22.
  • Wright, N. T. The Resurrection of the Son of God. Vol. 3. Christian Origins and the Question of God. Minneapolis: Fortress Press, 2003.
  • Young, Julian. The Death of God and the Meaning of Life. London: Routledge, 2004.
  • Young, Julian. “Nihilism and the Meaning of Life.” in The Oxford Handbook of Continental Philosophy, eds. Brian Leiter and Michael Rosen. Oxford: Oxford University Press, 2007.

Author Information

Joshua Seachris
Email: jseachris@nd.edu
University of Notre Dame
U. S. A.

Cognitive Penetrability of Perception and Epistemic Justification

Perceptual experience is one of our fundamental sources of epistemic justification—roughly, justification for believing that a proposition is true. The ability of perceptual experience to justify beliefs can nevertheless be questioned. This article focuses on an important challenge that arises from countenancing that perceptual experience is cognitively penetrable.

The thesis of cognitive penetrability of perception states that the content of perceptual experience can be influenced by prior or concurrent psychological factors, such as beliefs, fears and desires. Advocates of this thesis could, for instance, claim that your desire to have a tall daughter might influence your perception, so that she appears to you to be taller than she is. Although cognitive penetrability of perception is a controversial empirical hypothesis, it does not appear implausible. The possibility of its veracity has been cited in order to challenge positions that maintain that perceptual experience has inherent justifying power.

This article presents some of the most influential positions in contemporary literature about whether cognitive penetration would undermine perceptual justification and why it would or would not do so.

Some sections of this article focus on phenomenal conservatism, a popular conception of epistemic justification that more than any other has been targeted with objections that appeal to the cognitive penetrability of experience.

Table of Contents

  1. Cognitive Penetrability of Perception and its Consequences
    1. What is Cognitive Penetrability?
    2. The Epistemic Problem of Cognitive Penetrability
  2. Responses to the Epistemic Problem of Cognitive Penetrability
    1. Internalist Resolute Solutions
      1. The Defeasibility Approach
      2. The Intuitive Plausibility Approach
      3. The Different Epistemic Status Approach
    2. Externalist Concessive Solutions
    3. Internalist Concessive Solutions
      1. Process Inferentialism
      2. The Receptivity Approach
      3. The Knowledge-How Account
      4. Presentational Conservatism
    4. Other Options
      1. Sensible Dogmatism
      2. The Imagining Account
      3. The Analogy with Emotions
      4. The Sensorimotor Theory of Perception
  3. Conclusion
  4. References and Further Reading

1. Cognitive Penetrability of Perception and its Consequences

a. What is Cognitive Penetrability?

Our perceptual experiences present to us (accurately or not) facts in the world. For instance, you can have an experience as if a bird is singing or as if this ball is red. In these cases, that a bird is singing and that this ball is red can be said to be the representational contents of your experiences.

The cognitive penetrability of perception is a controversial empirical thesis that holds that the content of perceptual experience can partly be shaped by prior or concurrent psychological factors, such as beliefs, desires, traits, moods, entertained hypotheses, conjectures, emotions, expectations, hopes, wishes, doubts, suspicions, attitudes or knowledge that can be acquired through the right training. Whether cognitive penetrability of perception is a real phenomenon is investigated by cognitive science (Raftopoulos and Zeimbekis 2015). Relevant scientific experiments are described for instance in Payne (2001), Hansen et al. (2006), and Stokes and Payne (2011).

To familiarize ourselves with the notion of cognitive penetrability of perception, let us consider two imaginary cases of cognitive penetration: Siegel’s (2013a, 2017) Angry Jack and Markie’s (2005, 2006, 2013) Expert and Novice case (adjusted for the purposes of this article).

Angry Jack

Jill believes without good reason that Jack is angry. When she meets Jack, under the influence of her unjustified belief that Jack is angry, she sees Jack as being angry. Based on her perceptual experience as if Jack is angry, she retains the same belief and, perhaps, her confidence that Jack is angry is even enhanced. Had she not had the prior belief that Jack is angry with her, she would not have seen him as being angry.

Expert and Novice

Two friends are gold prospectors. One of them is an expert at identifying gold. He has learned to do so through long experience. He began with a list of identification rules and consciously applied them. He then reached the point where he could “just see” that a nugget is gold. The other friend is a novice. He has a general sense of what gold looks like, but he is not very good at its visual identification. He nevertheless longs to make a discovery. When the two friends happen to look at a nugget in a pan, the expert’s developed gold-identification abilities come into play, and he has the perceptual experience as if the nugget is gold. The expert believes accordingly. The novice’s strong desire that it be gold comes into play too, and he also has the perceptual experience as if the nugget is gold. The novice believes accordingly. Had the novice not had a strong desire to find gold, he would not have had the perceptual experience as if the nugget is gold. Had the expert not had very developed gold-identification abilities, he would not have had the experience as if the nugget is gold.

These two cases are supposed to be situations in which the contents of the relevant perceptual experiences are somewhat influenced by the subject’s prior mental states. Jill’s experience is influenced by her prior belief that Jack is angry. The novice’s experience is influenced by his strong desire to find gold, and the expert’s experience is influenced by his knowledge and experience. They are possible cases of cognitive penetration of perception.

As we see in the next section, the problem that cognitive penetrability poses to theories of perceptual justification rests on the intuition that in at least some cases in which perceptual experience is cognitively penetrated, justification is affected negatively. For instance, despite her experience as if Jack is angry, there seems to be something wrong in claiming that Jill has justification for believing that Jack is angry. The same applies to the novice’s case.

Arguably, there are also cases of good cognitive penetration of perception: namely, situations in which the subject’s experience is actually a good basis for some of her beliefs just because it is cognitively penetrated.

An example might be the expert’s cognitively penetrated experience as if the pebble is gold in Expert and Novice. Siegel (2012) provides another possible example in which a cognitively penetrated experience of an expert radiologist inspecting the X-ray of a patient is contrasted with a non-penetrated experience of a non-expert who attends to the same X-ray. Lyons (2011) suggests further examples involving perceptual learning as cases of good cognitive penetration. Perceptual learning is a process based on training and experience that ends up producing changes in the subject’s perceptual abilities (Connolly 2017). Perceptual learning is a form of diachronic cognitive penetration. Lyons also imagines a case of synchronic good cognitive penetration (the Snake Case) involving the sharpening of one’s snake-detection skills in virtue of one’s unjustified belief or fear that there are snakes on one’s trail.

Before going deeper into the relations between cognitive penetration and epistemic justification, we need to have a more accurate picture of what cognitive penetration of perceptual experience consists of.

Not just any kind of influence on perception by psychological states produces cognitive penetration. Some mental states might influence perceptual experience indirectly simply because they change the location from where the subject receives the perceptual stimuli. For example, if I desire to watch TV, I will turn my head towards the TV. So my experience will change from representing the monitor of my laptop to representing the TV. The change in perception imputable to cognitive penetration must not be explainable in terms of a reception of different perceptual stimuli due to body movements, defects of our sensory organs or—more controversially—a difference in the spatio-temporal locations attended to by the subject’s covert attention (Stokes 2012 and Vance 2014).

Siegel (2012), for instance, excludes voluntary shifts of attention from the definition of cognitive penetration. Nevertheless, she mentions as interesting those cases of cognitive penetration that involve relative indifference to stimuli or an attentional selection bias in favor of only particular loci of the stimuli.

For the time being, let us follow Siegel (2012) in accepting that in most cases of cognitive penetration this counterfactual would be satisfied: if S had a cognitive mental state different from the one she actually has, but attended to the same perceptual stimuli as those she actually attends to, S would not have the same perceptual experience. For instance, if the belief that Jack is angry were not part of Jill’s mental state, but Jill still attended to the very same features of Jack’s face, she would not have the perceptual experience as if Jack is angry.

Many philosophers of mind and epistemologists agree that perceptual experience has at least two interplaying components: sensory impressions (for example, colors, smells and tastes), and concepts (for example, the concept of bird and the concept of ball). These philosophers would claim that in order to have the experience as if, say, this ball is red, you need to combine a round and a red sense impression together with the concepts of ball and red into one suitable representational state.

As we later see, the thesis that the perceptual experience of a subject S can be cognitively penetrated is often interpreted in a disjunctive fashion as stating that the sensory impression component or the conceptual component of S’s experience can be cognitively penetrated. In the first case, S’s prior or concurrent mental states would directly change the low-level, non-conceptual part/stage of S’s experience. For instance, suppose that under the influence of her belief that Jack is angry, Jill comes to have visual sensations that typically lead to the formation of a higher-level conceptual angry-face representation. On the grounds of these sensations, it does appear to her that Jack is angry. In the second case, S’s prior or concurrent states would directly affect the part/stage of S’s experience that is conceptual. One could interpret the novice prospector case as an example of this: the novice’s strong desire to find gold produces an experience that, thanks to the concepts embedded in it, represents the pebble before him as gold.

It is important to distinguish S’s perceptual experiences from the doxastic states of S that can accompany these experiences. A perceptual experience as if P may be accompanied by a belief or judgment that P, but this belief or judgment would not be a part of the perceptual experience. Suppose for instance that S does have a perceptual experience as if this ball is red. Concurrently, S may or may not believe or judge that this ball is red. In the same way, one’s perceptual experience as if P may be accompanied by one’s reflective belief that one has a perceptual experience as if P, but this reflective belief would be something distinct from the perceptual experience. Suppose again that S has a perceptual experience as if this ball is red. Concurrently, S may or may not entertain a reflective belief that she has an experience as if this ball is red.

It does not seem implausible that S’s previous or concurrent mental states could directly influence S’s perceptual or reflective beliefs without affecting S’s perceptual experiences. Imagine, for instance, that though Jill does have a perceptual experience as if Jack is not angry, she forms an inaccurate perceptual belief that Jack is angry because she fears that Jack is angry. Alternatively, imagine that although Jill has a perceptual experience as if Jack is not angry, she forms a mistaken reflective belief that she has a perceptual experience as if Jack is angry, due to her belief that Jack is angry.

Most of the philosophers involved in the debate on cognitive penetrability would not consider cases like those just described to be genuine examples of cognitive penetration of perceptual experience. The basic problem is that they do not concern effects of S’s mental states on S’s perceptual experience.

Nevertheless, for a comprehensive conception of cognitive penetrability of perception that includes cases like the ones just described, see Lyons (2011). Siegel (2015, 2017) discusses another comprehensive view according to which previous or concurrent mental states of the subject can affect the subject’s perceptions, conceived of in a broadened sense to include also, for instance, experiential judgments and patterns of attention. However, Siegel is careful in using the expression “perceptual farce” just to refer to this general view and in distinguishing it from the more specific view that perceptual experience is cognitively penetrable.

The remainder of this article takes cognitive penetrability to be a phenomenon pertaining to the conceptual component or the sensory impression component of experience.

b. The Epistemic Problem of Cognitive Penetrability

Perceptual experience is, so to speak, the tribunal by which most beliefs can be checked with respect to their epistemic status. The epistemological problem of cognitive penetrability essentially stems from a clash of two intuitions about the credentials of this tribunal. The first intuition says that perceptual experiences in general possess the kind of intrinsic features that would make the beliefs based on them justified. The second, contrasting intuition says that badly cognitively penetrated experiences, such as the experiences of Jill in Angry Jack and the novice in Expert and Novice, cannot actually justify the beliefs based on them (see Lyons 2016). As will shortly become clear, the philosophical question underlying this clash of intuitions is whether the causal history, or etiology, of an experience can affect its justificatory power.

It is important to appreciate that although cognitive penetrability is a controversial empirical hypothesis, scientific investigation is not crucially relevant to this epistemological debate. Those who share the intuition that perceptual experiences have intrinsic features that make the beliefs based on them justified typically take this claim to be true a priori of any possible contentful experience as such. In consequence, if cognitive penetration were incompatible with the justificatory power of perceptual experience, even if our hardwiring ruled out cognitive penetrability, the mere possibility of a rational being suffering from cognitive penetration of perception would constitute a threat to that intuition (Markie 2013 and Tucker 2019).

To probe these complex issues, we need now to introduce some basic epistemological notions and individuate one theory of perceptual justification to use as a good example.

Internalists about epistemic justification claim that all the factors that make a subject S possess justification for believing a proposition are (i) reflectively accessible to S or (ii) mental states of S. In case (i), the view is called accessibilism; in case (ii), it is called mentalism. Factors that provide S with justification could for instance be other beliefs of S or her experiences. Externalists about justification deny both (i) and (ii) (see Pappas 2014 and Poston 2018). For example, according to a prominent form of externalism called reliabilism, what renders a belief of S justified is its being produced by a (statistically) reliable process, regardless of whether the process is reflectively accessible to S or not, and of its being wholly mental or not (see Goldman 1979).

Phenomenal conservatism (Huemer 2001 and 2007) is the theory of epistemic justification that many if not most early twenty-first century internalists invoke to account for the justificatory power of experiences. (See Audi 1993 and Pryor 2000 for similar views.) In accordance with it, it is a priori true that:

(PC) If S has a seeming that P, S thereby has prima facie justification for believing P.

Seemings (or appearances) are typically conceived of as experiences provided with a propositional content. (Some phenomenal conservatives think of a perceptual seeming as, specifically, the conceptual component of an experience. For others, a perceptual seeming is made of the conceptual component together with the sensory impression component of an experience.) Although seemings may include more than perceptual experiences—some philosophers think there are, for example, rational, moral and mnemonic seemings—we focus here on perceptual seemings.

(PC) is to be interpreted as stating that if S has a seeming that P and no defeating evidence, S possesses both prima facie and all things considered justification for believing P; whereas if S does have defeating evidence, S possesses only prima facie justification for believing P. Defeating evidence can be any reason for S to believe that P is false or that the seeming that P is deceptive. The ‘thereby’ in (PC) indicates that S’s justification for P comes solely from her seeming that P. Since it does not result from any belief of S, this justification is immediate.

Phenomenal conservatism is customarily taken to be an internalist—both accessibilist and mentalist—theory of justification because it fits with (though it does not entail) the assumption that S’s justification depends only on mental factors reflectively accessible to S—namely, S’s appearances and the absence of defeating evidence.

Let us now investigate the problem of cognitive penetrability in relation to phenomenal conservatism. This is indeed the theory of justification that has been most discussed in this context. (See Siegel 2012 and Tucker 2014 about the significance of cognitive penetrability for other theories of epistemic justification.)

Phenomenal conservatism accounts for the internalist intuition that perceptual experiences in general have intrinsic features capable of justifying the beliefs based on them. Suppose S has an experience with content P. If (PC) is correct, S thereby has at least prima facie justification for believing P. Phenomenal conservatism has attracted objections from many epistemologists, both internalist and externalist, who share the contrasting intuition that it is in many cases implausible that a cognitively penetrated experience can justify a belief, even if only prima facie.

Siegel (2012) has described a way in which this intuition becomes palpable: cognitive penetration of perceptual experience seems to allow for the elevation of S from a worse epistemic position to a better one in cases in which such an elevation appears illegitimate or impossible. This epistemic elevation may occur when the penetrating state is unjustified or when it is justified. An instance of the first case is the one in which S gets support for an initially unjustified belief B entertained by her from B itself, through the mediation of an experience cognitively penetrated by B. This is what arguably happens to Jill in Angry Jack: Jill gets support for her initially unjustified belief (B) that Jack is angry from the very same belief B, thanks to the mediation of the perceptual experience as if Jack is angry, cognitively penetrated by B. An instance of the second case would be one where S gets additional support for a justified belief on the basis of a perceptual experience cognitively penetrated by that very same belief. Imagine that, before meeting Jack, Jill forms a justified belief (B) that Jack is angry, for she receives a furious email from him. This prior justified belief B makes Jill have the experience as if Jack is angry when she meets him later on. Thanks to this experience, Jill would get additional support for B.

To facilitate our discussion, let us introduce the downgrade thesis (Siegel 2013a and Teng 2016). This thesis holds that a badly cognitively penetrated perceptual experience as if P provides less prima facie justification for believing P than a non-penetrated perceptual experience sharing the same content P. More precisely, if the whole content of the experience is badly cognitively penetrated, the experience as a whole is downgraded; and if only a part of it is badly cognitively penetrated, only that part of the experience is downgraded. For example, suppose S has a badly cognitively penetrated experience as if there is a red car before her. If what is badly cognitively penetrated is just the part of S’s experience that represents the car’s color, S’s experience is downgraded only with respect to the color. Thus, S has prima facie justification for believing that there is a car before her, but less or no prima facie justification for believing that the car is red (Teng 2016).

There is an interesting similarity between the cognitively penetrated experiences of a subject S and the experiences that S would have if she were a victim of a skeptical scenario (such as the Matrix scenario or the evil demon scenario envisaged by Descartes). In both cases, S’s experiences would have anomalous etiologies. In the first case, some mental state of S would interfere with S’s normal causal chains that produce experiences of a certain type. For example, the novice prospector’s craving for gold interferes with his normal visual processes. In the second case, the distal causes of S’s perceptual experiences would be unnatural. For example, if S were in the Matrix, the external cause of her visual experience of a cat would be the Matrix rather than a cat. Despite this similarity, many internalists tend to treat the cases of bad cognitive penetration and the cases of skeptical scenarios differently.

Internalists generally agree that when S is in a skeptical scenario, the anomalous etiologies of S’s perceptual experiences do not downgrade these experiences, so they do not affect their justifying power. The reason is that the segments of the etiologies of the perceptual experiences that make S a victim of a skeptical scenario are neither accessible to S nor mental states of S, which means they could not affect S’s perceptual justification. Internalists agree that if S were in a skeptical scenario, her perceptual beliefs would be at least prima facie justified when based on appropriate experiences. Internalists have long been using this argument to attack externalists about justification. Externalists seem in fact to be committed to holding that S’s perceptual beliefs would all be unjustified if S were deceived by the Matrix or a Cartesian demon, for these beliefs would all be false, which would entail that they are produced by unreliable processes (Poston 2018).

When it comes to cognitive penetrability, nevertheless, many internalists and externalists agree that if a perceptual experience of S were badly cognitively penetrated, it would be downgraded to the effect that S would lack prima facie justification for believing its content (Siegel 2012 and Tucker 2013). Externalists could defend this view by insisting that the anomalous etiologies of these perceptual experiences make the processes producing the correlated perceptual beliefs unreliable. Nevertheless, it is not immediately clear why the etiologies of cognitively penetrated experiences and the etiologies of experiences in skeptical scenarios should be considered to be so relevantly different from an internalist point of view. As we see later in the article, certain responses to the epistemic problem of cognitive penetration aim to illuminate this issue too.

2. Responses to the Epistemic Problem of Cognitive Penetrability

The debate on cognitive penetrability and perceptual justification has at least three basic and influential sides. One is the internalist resolute side, which aims to reject the downgrade thesis. For the most part, this is the side of the advocates of phenomenal conservatism. Another side is the externalist reliabilist one, which rejects (PC), does subscribe to the downgrade thesis and explains the weakening or annihilation of the justificatory power of badly cognitively penetrated experiences in terms of unreliability. The third side belongs to the internalist camp, but it deviates from the resolute one. This third side—called here the internalist concessive side—accepts the downgrade thesis but attempts to explain why perceptual justification is undermined in bad cognitive penetration cases, with the aim of, simultaneously, respecting internalist principles. The epistemologists belonging to this side all reject (PC), but some propose views that could be described as variants of phenomenal conservatism. Beyond these three fundamental sides, there are accounts that offer solutions to the problem of cognitive penetrability that do not fit the internalism-externalism dichotomy. The following sub-sections are dedicated to the presentation of key arguments that have been developed within all the aforementioned approaches, as well as to important objections to them.

a. Internalist Resolute Solutions

There are at least three distinct but not incompatible approaches adopted by internalists who reject the downgrade thesis: (i) the defeasibility approach, according to which cognitive penetration does not affect prima facie justification but can only influence all things considered justification; (ii) the intuitive plausibility approach, which rejects the downgrade thesis by heavily relying on internalist intuitions about the irrelevance of etiology as a justificatory factor and intuitions about the justifying power that perceptual experiences have thanks to their intrinsic features; and (iii) the different epistemic status approach, according to which in bad cognitive penetration cases the subject lacks not epistemic justification but rather some other epistemic property or status.

i. The Defeasibility Approach

According to the defeasibility approach, all cases of bad cognitive penetration can be construed as situations where S does have defeating evidence; namely, S suspects, believes or is in some other sense aware that (1) her perceptual experience would have been different if her prior mental state had been different; or S suspects, believes or is in some other sense aware that (1) and that (2) her prior mental state was unjustified or an unreliable guide to truth (see Siegel 2012 and Huemer 2013b). For instance, in Expert and Novice, arguably, the novice is in some sense aware that (1) he would not have had the experience as if the pebble is gold if he had not had the desire to find gold; or he is in some sense aware of both (1) and that (2) one’s craving for gold can make one’s perceptual experience of gold unreliable.

The advocates of this strategy contend that in all cases of bad cognitive penetration, S’s prima facie justification remains untouched. S would instead lack all things considered justification in virtue of having an evidential defeater. These epistemologists emphasize that this is consistent with the account of prima facie justification based on (PC) (Huemer 2013b).

An expected criticism says that in many cases of bad cognitive penetration, S is not actually aware that her perceptual experience is cognitively penetrated or that her cognitively penetrated experience is an unreliable guide to truth, though S could become aware of it (McGrath 2013b and Markie 2013). In response one might appeal to a weaker notion of evidential defeater. One might contend that S would have an evidential defeater even if she were just able to become aware of it, without being actually aware of it (see Siegel 2012). But this would not resolve all problems because the mental state that should work as an evidential defeater might be such that S could not possibly become aware of it (Siegel 2012 and Markie 2013). For example, think of a variant of Angry Jack in which Jill, because of inborn or induced cognitive deficiencies, is incapable of coming to believe that her perceptual experience would have been different if she had had a different prior cognitive state.

The main reason for concern about the defeasibility approach, however, stems from the intuition, which some epistemologists have, that in the case of bad cognitive penetration the subject would lack even prima facie justification (Markie 2005, Huemer 2013b and Tucker 2014).

ii. The Intuitive Plausibility Approach

Phenomenal conservatives may try to defend the contention that in the case of bad cognitive penetration, S would at least have prima facie justification by highlighting its plausibility against a background of internalist intuitions. A key thesis adduced in this context is that perceptual experiences have justifying power in virtue of being experiences, rather than in virtue of having a particular sort of etiology (see Lyons 2016). In accordance with this view, perceptual experiences can differ in their epistemic power only in virtue of their intrinsic factors, not because of their etiologies.

Let us see how this response can be developed. The intuitive plausibility approach aims to support the claim that (i) reflectively inaccessible etiologies of perceptual experiences in cognitive penetration cases play no role in determining whether or not perceptual experiences provide prima facie justification, and the claim that (ii) perceptual experiences possess intrinsic justificatory force. (i) and (ii) are two sides of the same coin.

A way to support (i) is to appeal to the absence of essential differences between bad cognitive penetration cases and zap-like cases (Siegel 2012). ‘Zap-like’ is the expression used by Siegel (2013a) to refer indifferently to scenarios involving bump-on-the-head situations (that is, cases in which S has a hallucination caused by a knock or bump on her head) and skeptical scenarios (involving, for instance, evil demons or the Matrix). Internalists may insist that cognitive penetration cases are not substantially different from zap-like cases. After all, the etiologies of perceptual experiences in cognitive penetration cases are processes reflectively inaccessible to the subject S, just as the etiologies of zap-like cases. Furthermore, the etiologies of perceptual experiences in cognitive penetration cases are processes that do not seem to be subject to S’s rational control, just as the etiologies of zap-like cases. It may appear plausible that the etiology of S’s perceptual experience in a zap-like scenario plays no role in determining whether or not S’s perceptual experience provides S with prima facie justification for her beliefs. (For instance, it may appear plausible that if an evil demon causes Jill’s perceptual experience as if Jack is angry, this fact cannot interfere with the prima facie justification for believing that Jack is angry, which Jill possesses in virtue of her experience. For the evil demon’s actions are reflectively inaccessible to Jill and are not subject to Jill’s rational control.) Since the cases of cognitive penetration are not relevantly different from the zap-like cases in terms of their etiologies, it can be argued that the etiologies of cognitively penetrated experiences likewise play no role in determining whether or not S’s perceptual experience provides S with prima facie justification for her beliefs.

Although internalists may welcome this defense of (i), many externalists will not concede at the outset that justification is not negatively affected in zap-like cases. They will contend that since the relevant perceptual experiences are misleading in these cases, the correlated belief-formation processes are unreliable. These externalists would conclude that if we appeal to absence of essential differences, we must accept that prima facie justification is negatively affected in cases of bad cognitive penetration too.

A different criticism of this defense of (i) targets the claim that the etiologies of perceptual experiences in cognitive penetration cases are not subject to S’s rational control, just as the etiologies of zap-like cases. The claim is that whereas S may in certain cases be able to avoid bad cognitive penetration by controlling known factors that lead to it, S could not by assumption control the factors that make her a victim of zap-like cases (Siegel 2012 and 2013a). But even if it were established that the etiologies of perceptual experiences in cases of cognitive penetration are not subject to S’s rational control, there could be a debate about whether the etiologies of perceptual experiences in cases of cognitive penetration are in some sense attributable to S in a way that the etiologies of experiences in zap-like cases are not (Siegel 2013a). Internalist accessibilists can nevertheless insist that despite these complications, it is the shared inaccessibility of the etiologies of zap-like cases and cognitive penetration cases that makes these cases homologous. So the claim would be that if S is unaware of the defective etiology in bad cognitive penetration cases, just as it happens to S in zap-like cases, the etiology must be irrelevant to S’s justification in those cases.

A more direct way to defend (i) is adducing the phenomenology (or subjective features) that a cognitively penetrated perceptual experience shares with a non-penetrated perceptual experience with the same content (see Siegel 2012). For instance, Jill’s perceptual experience as if Jack is angry looks the same when it is the effect of cognitive penetration and when it is not. The perceptual experiences in these two cases are identical in terms of what is introspectively accessible. It could therefore be argued that whether or not an experience is the effect of cognitive penetration is irrelevant to what one has prima facie reason to believe or not. Only evidence of a distorting etiology could be a defeater and affect all things considered justification (Huemer 2013a, see also Silins 2016).

Another way to support (i) is appealing to the intuition that it is implausible that S’s justification for an attitude A could depend on reasons that S could not adduce to explain whether A is justified or not. For instance, an argument by Huemer in defense of (i) considers a case in which S is unable to draw an epistemically significant distinction between the penetrated part and the non-penetrated part of the content of one and the same perceptual experience. Imagine I have one partly cognitively penetrated perceptual experience as if there is a gun and a box with eggs in the fridge. The gun-like part of my perceptual experience is cognitively penetrated, whereas the box-like part is not.

I accept E [that there is a box with eggs in the fridge] on the basis of my visual experience. G [that there is a gun in the fridge] also appears to be equally well supported by my visual experience, and I have no reason for thinking the experience representing G to be any less reliable, nor epistemically inferior in any manner whatsoever, to the experience representing E. Nor have I any other grounds for doubting G. Nevertheless, while I accept E, I refuse to accept G, for no apparent reason . . . This attitude . . . strikes me as obviously irrational. I conclude that . . . [I] epistemically ought to accept G . . . If S would have no rational way of explaining why she believed E while refusing to accept G, then S would be irrational to believe E while refusing to accept G (Huemer 2013a, pp. 745–746).

This argument assumes that whether S is justified or unjustified in believing P depends on whether S can potentially appeal to the reasons that make her justified or unjustified (Siegel 2013b). Given this assumption, S is not unjustified in believing P unless she can rationally explain why she is so. According to this line of thought, justification depends only on reflectively accessible factors. For S’s being in principle able to appeal to the reasons that determine whether she is justified or not in believing P requires S to be able to reflectively access those reasons. Given this, the etiology of perceptual experiences in cognitive penetration cases is irrelevant to S’s justification insofar as it is reflectively inaccessible to S. Setting aside general criticism of accessibilism, a concern about this strategy is that it is not uncontroversial that S can be justified or unjustified in adopting an attitude A only if S is potentially able to rationally explain why she is justified or unjustified in adopting A. (See two apparent counterexamples in McGrath 2013a and in Siegel 2013b.)

We have considered ways of supporting or questioning (i)—the thesis that reflectively inaccessible etiologies of perceptual experiences in cognitive penetration cases are irrelevant to prima facie justification. Let us turn to (ii)—the thesis that perceptual experiences possess intrinsic justificatory force. (ii) is directly supported by an apparently straightforward argument resting on an intuition about what attitude S is rationally supposed to adopt, from her point of view, when S entertains a given mental state (McGrath 2013a). If S has an experience as if P and no evidence against P, the most reasonable attitude to take from S’s point of view is belief, rather than disbelief or a suspension of judgment. A parallel argumentative line interprets perceptual experiences as evidence (McGrath 2013a). Considering that S, as a rational believer, has to match her belief to the evidence E available to her, S should form only beliefs that fit E, whatever E might be. Even if, unbeknownst to S, E were acquired through a biased search or flawed method of evidence-gathering, E would constitute the evidence available to S. So, S should adjust her doxastic attitude in a way to fit E, independently of its etiology.

A further way of defending (ii) might be to appeal to coherence requirements derived from an experience as if P. Suppose S does not have justification for believing P, but nevertheless S does believe P. In this case it is rational for S to believe, say, P-or-Q and disbelieve, say, Not-P. In general, if S is in a mental state M, S is rationally required to think in a particular way in virtue of coherence requirements derived from being in M, regardless of the credentials of M. One could argue that, in the same way, S has prima facie justification for believing R if S has a perceptual experience as if R, in virtue of coherence requirements and independently of the credentials of the experience, and so independently of its etiology (see McGrath 2013a).

However, a reply would be that even if it is rational for S to believe P-or-Q when S believes P, in this case S does not necessarily have justification for believing P-or-Q. For S may not have justification for believing P in the first instance (McGrath 2013a and Ghijsen 2016). The intuition that this reply exploits is that the kind of rationality that would provide S with justification for believing P-or-Q is not reducible to coherence requirements. The rationality resting solely on coherence is a sort of conditional rationality: it can provide S with justification for P-or-Q only if S has justification for believing P in the first instance.

An illuminating distinction is the one between rational commitment and justification. If S believes P without justification, she is rationally committed to, for instance, disbelieving not-P and believing P-or-Q, but she does not have justification for disbelieving not-P and believing P-or-Q. Rational commitment is a mere coherence requirement (Tucker 2013 and McGrath 2013a, 2013b).

iii. The Different Epistemic Status Approach

This approach aims to substantiate the thesis that if S is in a case of bad cognitive penetration, ordinarily S does not lack (prima facie) justification but some other epistemic status. Various epistemic statuses have been proposed.

A popular candidate is knowledge, or else warrant, namely the additional property that a true belief needs to have in order to become knowledge (Tucker 2010 and Huemer 2013a). The no knowledge/warrant approach says that in bad cognitive penetration cases S does not lack justification. Rather, S possesses justification without having knowledge or warrant. For instance, S could have justification for believing P without her belief tracking the truth, or without her belief arising from a reliable belief-forming mechanism, or without her belief arising from a belief-forming mechanism that works properly (Huemer 2013a). This is what presumably happens in evil demon cases or Gettier-style scenarios (see Siegel 2013a for a formulation of cognitive penetration cases as Gettier cases). A general concern about this strategy stems from the previously mentioned impression that there are substantial differences between badly cognitively penetrated perceptual experiences and the perceptual experiences of a victim of a skeptical scenario or a Gettier-style scenario (Tucker 2010 and Markie 2013). In all these cases, the subject S basing her beliefs on her perceptual experiences lacks knowledge and warrant. Nevertheless, in bad cognitive penetration cases, S may also appear to be blameworthy for having her experiences in a way that the victim of a skeptical scenario or a Gettier-style scenario may not (Tucker 2010). If justification essentially depended on the absence of blameworthiness, the fact that S lacks knowledge or warrant in bad cognitive penetration cases would be redundant or insufficient to explain our intuitions.

To dispel this concern, Tucker (2010) adduces the Weirdo thought experiment. Weirdo successfully begs a demon to turn him into a victim of a skeptical scenario and erase this request from his memory. Tucker insists that it is intuitive that when Weirdo becomes a victim of a skeptical scenario, though he is blameworthy (or lacks blamelessness) for having his deceptive perceptual experiences and he lacks knowledge and warrant, Weirdo is nevertheless justified in his beliefs about the external world (Tucker 2010, 2011). This suggests that S’s being blameworthy (or lacking blamelessness) plays no role in determining whether S is justified in bad cognitive penetration cases (assuming that there is no principled distinction between Weirdo’s blameworthiness and S’s blameworthiness due to cognitive penetration).

To question the no knowledge/warrant approach, Markie (2013) uses a different thought experiment. Suppose a novice gold prospector and an expert are in the same skeptical scenario. The expert’s experience as if the nugget before him is gold is a non-penetrated perceptual experience or a case of good cognitive penetration (given the external stimuli provided by the demon), whereas the novice’s perceptual experience as if the nugget is gold is partly caused by his “wishful seeing,” so it is a case of bad cognitive penetration (see also Tucker 2010 and McGrath 2013b). Markie stresses that the novice’s epistemic status appears worse than the expert’s despite their both lacking knowledge and warrant due to the skeptical scenario. This suggests that what explains the intuitive inadequacy of the epistemic status of the novice, and the intuitive inadequacy of the epistemic status in any bad cognitive penetration case, must be something different from knowledge and warrant.

Tucker (2010) observes that Markie’s case does not necessarily show that bad cognitive penetration affects justification. He suggests that although both the novice and the expert in the skeptical scenario lack knowledge and warrant, what renders them different from an epistemic point of view is simply this: only the novice is epistemically blameworthy in having his experience. Tucker thus proposes a novel candidate for rescuing justification: a victim of bad cognitive penetration does not lack epistemic justification but epistemic blamelessness. She is both justified and blameworthy.

Epistemologists have also considered other epistemic statuses whose absence might explain why bad cognitive penetration cases are epistemically defective, for instance: epistemically virtuous belief or proper function of the cognitive faculty (McGrath 2013b); positive evaluation of the subject’s cognitive character (Tucker 2013, drawing from Skene 2013); and practical appropriateness of belief-formation (Fumerton 2013).

b. Externalist Concessive Solutions

Externalist reliabilists, such as Lyons (2011, 2016) and Ghijsen (2016), typically agree with concessive internalists (whom we consider in Section 2.c) on the truth of the downgrade thesis (Teng 2016). The major point of departure of the concessive reliabilists from the concessive internalists concerns the explanation of why prima facie justification is negatively affected by bad cognitive penetration. Concessive reliabilists offer a traditional externalist account, which adduces the unreliability of the processes that produce bad cognitive penetration.

Cognitive penetration is epistemically bad—when it is bad—because and when it cuts us off from the world around us, when it makes us less sensitive to our environments, when it makes us more likely to believe p whether or not p is actually true (Lyons 2016, p. 3).

Bad cognitive penetration of perceptual experience can be construed as a phenomenon that renders the process of belief-formation unreliable with respect to its statistically tracking the truth, or as a phenomenon that makes a perceptual experience as if P an inappropriate ground for S’s belief that P (see Lyons 2011, 2016).

The contemporary debate on cognitive penetration and epistemic justification typically presupposes that cognitive penetration may either worsen or enhance the epistemic status of perceptual experience (see Section 1.a). A virtue of concessive reliabilism is the illuminating explanation that it offers for distinguishing the cases of bad cognitive penetration from the cases of good cognitive penetration (Ghijsen 2016). According to Lyons (2011, 2016), whereas the cases of bad cognitive penetration are those that affect reliability negatively, the cases of good cognitive penetration are those that affect reliability positively. And this is so regardless of whether the penetrating state is a (justified or unjustified) belief or a non-doxastic state like a desire or a fear.

Another asserted virtue of the concessive reliabilist account is that it offers a unitary solution to the problem of cognitive penetration and the problem of why perceptual experiences can have or lack justificatory power when experience is unpenetrated. In particular, it explains the cases in which S is affected by bad cognitive penetration and the cases in which S is a victim of a skeptical scenario by claiming that both situations are essentially cases in which S’s belief-production processes are unreliable (Ghijsen 2016). As we see in Section 2.c, the responses to the cognitive penetration problem by concessive internalists do not offer unitary solutions of this type. One might adduce this consideration to argue that the reliabilist accounts are preferable (see Ghijsen 2016).

A way to question this reliabilist response to the cognitive penetrability problem is to raise standard objections to reliabilism about justification (see Becker 2018). Moreover, Tucker (2014) has argued that this reliabilist response fares no better than internalist resolute solutions. Suppose S’s perceptual experience as if P is cognitively penetrated by her desire that P but P happens to be actually true most of the time when this cognitive penetration obtains. To accommodate suppositions of this type, reliabilists might need to bite the bullet and claim that the output-beliefs of such processes would be actually justified, though this may appear counterintuitive. In a similar fashion, resolute internalists insist that justification is safe from the threat of cognitive penetration. For further criticism see, for instance, Vahid (2014).

c. Internalist Concessive Solutions

This section surveys the principal internalist concessive solutions to the cognitive penetrability problem. As previously mentioned, these accounts accept the downgrade thesis and reject (PC), but they might be described as modifications of phenomenal conservatism that confine the existence of the justificatory power of perceptual experiences to particular circumstances: when certain enabling factors are present or some disabling factors are absent (Chudnoff 2019).

We first examine three versions of what Lyons (2016) calls inferentialism: Siegel’s process inferentialism, McGrath’s receptivity approach, and Markie’s knowledge-how account. Inferentialism rests on the assumption that the proper way to epistemically assess a perceptual experience of S (and S’s beliefs based on it) is to check the way in which S has produced the perceptual experience, roughly in the same way in which we epistemically assess a belief B of S by checking the way in which S has inferred B from other beliefs. A key assumption is that in any case of bad cognitive penetration, the epistemic status of the relevant experience is downgraded as a result of the experience having a rationally assessable etiology but failing to meet certain standards of epistemic rationality. Whether a perceptual experience has justificatory power thus depends on its causal history (Lyons 2011, 2016). Since the factors that determine S’s perceptual justification, that is, the etiologies of S’s perceptual experiences, are thought of as mental processes of S which are possibly reflectively inaccessible to S, inferentialism is typically considered to be an internalist mentalist view (Lyons 2016).

At the end of this section we examine Chudnoff’s presentational conservatism, an internalist (partly) concessive account that does not qualify as inferentialist.

i. Process Inferentialism

Siegel (2013a, 2013b) maintains that a perceptual experience gets epistemically downgraded whenever it has a checkered past; namely, its etiology is similar with respect to its psychological elements to the etiology of a (possible) belief that has the same content and proves unjustified. Consider this example that draws an analogy between wishful seeing and wishful thinking. John’s wishfully seeing that Jack is angry consists of John’s visual experience as if Jack is angry, produced by an etiology involving cognitive penetration by John’s desire that Jack is angry. John’s experience has a checkered past because its etiology is similar with respect to its psychological elements to the etiology of an unjustified belief that Jack is angry, which John could have out of his wishful thinking.

Note that a cognitively penetrated perceptual experience may not have a checkered past. Nevertheless, all beliefs based on cognitively penetrated experiences with a checkered past are ill-formed, and so unjustified (Siegel 2013a).

The internalist who, like Siegel, endorses the downgrade thesis must explain why a perceptual experience may lose its justificatory force because of cognitive penetration, but does not when the subject is simply in a zap-like state. Siegel (2013a) maintains that the etiology of a perceptual experience when the subject is in a zap-like state results from an arational process, whereas the etiology of a badly cognitively penetrated perceptual experience results from a rationally assessable but irrational process. People might find it counterintuitive that these processes are rationally assessable. A process inferentialist may insist, however, that rationally assessable etiologies are those that lie within the cognitive system of the subject, whereas arational etiologies are external to the subject’s cognitive system. Another possibility is that rationally assessable etiologies are those over which the subject has some type of rational control, which is impossible in zap-like cases (Siegel 2012, 2013a).

Process inferentialism has further problems. It is to a good extent indeterminate, by this account, which etiologies of perceptual experiences are defective and why. For it is unclear in what precise respects and to what extent the etiologies of perceptual experiences should share similarity in structure with the etiologies of ill-formed beliefs to qualify as defective (Lyons 2016). Furthermore, although there are paradigmatic instances of ill-formed beliefs (for example, those based on wishful thinking or jumping to conclusions), the distinction between well-formed and ill-formed beliefs is not always clear-cut. So, the only way to draw these distinctions might ultimately be by relying on people’s intuitions, which might diverge (Siegel 2013a). If bad etiologies cannot be identified by means of an effective criterion, process inferentialism is ineffective in distinguishing good cognitive penetration cases from bad ones. If the only way to draw this distinction with precision were appealing to a reliabilist criterion, process inferentialism would not fulfill its internalist ambitions (Lyons 2016).

Another possible source of difficulty for process inferentialism turns on relevant dissimilarities between experiences and beliefs. All perceptual experiences possess, many epistemologists contend, a distinctive phenomenology capable of turning them into justification-providing states; but this phenomenology is not to be found in any belief. This might indicate that the features of the etiologies of perceptual experiences are irrelevant to their justificatory power, and that drawing epistemological conclusions from analogies between perceptual experiences and beliefs is ultimately misguided (see Vance 2014 and Silins 2016).

For responses to these and other concerns, and an updated defense of process inferentialism, see Siegel (2017, 2018).

ii. The Receptivity Approach

McGrath’s (2013a, 2013b) receptivity approach puts emphasis on the relation between perceptual experiences and their bases. Beliefs can be based on other mental states. In this account, perceptual experiences can be so based too. McGrath maintains that one’s seemings can produce other seemings in one’s mind, and draws a distinction between receptive and non-receptive seemings. A receptive seeming is the input and a non-receptive seeming is the output of a quasi-inference, a process that constitutes the transition from one seeming to another. More precisely,

A transition from a seeming that P to a seeming that Q is “quasi-inferential” just in case the transition that would result from replacing these seemings with corresponding beliefs that P and Q would count as genuine inference by the person (McGrath 2013b, p. 237).

Receptive seemings are unconditional justification-providing states of a subject S, whereas non-receptive seemings give S justification only if the relevant quasi-inference is good. Receptive seemings are given to S, whereas non-receptive seemings arise after S’s own doing. The former seemings provide S with justification without being epistemically assessable. The latter seemings are epistemically assessable due to their stemming from S’s own making (McGrath 2013b).

A good quasi-inference can be characterized by a comparison with a good inference between beliefs. A good inference is one that results in a justified output-belief. Assuming for simplicity that only two beliefs participate in the inference, what is involved in a good inference is a transmission of justification from one belief to another. This happens only if the first belief is justified and sufficiently supports the second. Furthermore, a good inference requires for the subject S some sort of appropriate rationalization (which need not involve higher-order thinking or justification)—for example, S’s correct grasp of the epistemic relation of support between the two beliefs, S’s correct use of background information stored in S’s cognitive system as relevant knowledge-how, or a mix of these two. This rationalization would not be appropriate, for instance, if it depended on factors that would make S jump to conclusions, such as expectations, desires and moods (McGrath 2013b). Analogously, in a good quasi-inference between seemings, what is involved is the transmission of the property, which a seeming might possess or lack, of making S have justification for believing its content. Only receptive seemings have this property by default. In a good quasi-inference, the receptive seeming transmits this property to the non-receptive seeming. As a result, S can be justified in believing the content of the non-receptive seeming. Yet, if the non-receptive seeming is not sufficiently supported by the receptive seeming—because an output-belief with the content of the first seeming would not be sufficiently supported by an input-belief with the content of the second seeming—the non-receptive seeming does not receive the relevant epistemic property. In this case, the quasi-inference is not good, and S does not wind up having justification for believing the content of the non-receptive seeming (McGrath 2013a, 2013b).

The receptivity approach explains the downgrade of perceptual experience affected by bad cognitive penetration by adducing the features of a correlated quasi-inference: the downgrade happens when the quasi-inference is bad (McGrath 2013a, 2013b). Take Angry Jack. In the receptivity approach, Jill initially entertains a receptive seeming about Jack’s face that has the intrinsic property of giving Jill justification for believing that Jack is not angry. Under the influence of cognitive penetration by her unjustified belief that Jack is angry, this receptive seeming is replaced in Jill’s mind with a non-receptive seeming that Jack is angry. This is a bad quasi-inference because the receptive seeming does not support the non-receptive seeming, as belief in the content of the first would not support belief in the content of the second. Hence, Jill is not justified in believing that Jack is angry.

It is unclear whether this approach can accommodate a disunified view of perception—one that distinguishes between sensations (low-level and non-conceptual) and seemings (high-level and conceptual) (Lyons 2016). What McGrath calls non-receptive seemings are states with conceptual content—so proper seemings. However, McGrath seems to concede that receptive seemings are not necessarily states with conceptual content—they may be sensations. This means that, for McGrath, a perceptual experience may arise from a quasi-inference whose input—the receptive seeming—is constituted by mere sensations. Yet a quasi-inference requires all seemings involved to have believable contents, and thus conceptual contents (see Lyons 2016). Moreover, suppose that perception is actually disunified and that the proponent of the receptivity approach denies that mere sensations can be the inputs of quasi-inferences. They should conclude that, for example, the transition in Jill’s mind leading to her perceptual experience that Jack is angry is not a quasi-inference. A consequence would be that this perceptual experience would be a receptive seeming, and thus a justification-provider. Many would find this counterintuitive (see McGrath 2013b and Lyons 2016).

Another concern is that the receptivity approach does not address what might actually be at stake in cases of bad cognitive penetration: the cognitive penetration of receptive seemings, rather than non-receptive seemings (Lyons 2016). Take again Angry Jack. Suppose the correct description of what happens is this: because of her unjustified belief that Jack is angry, Jill has a cognitively penetrated receptive seeming that Jack’s face has anger features. This receptive seeming produces in Jill’s mind, via a quasi-inference, a non-receptive seeming that Jack is angry. If this were the correct description of what happens in Angry Jack, the proponents of the receptivity approach should conclude that Jill is justified in believing that Jack is angry on the basis of her non-receptive seeming that Jack is angry. For this non-receptive seeming is actually supported by Jill’s receptive seeming that Jack’s face has anger features.

Lyons (2016) complains that the receptivity approach treats cognitively penetrated non-receptive seemings as person-level phenomena, though it is intuitive that perceptual experiences do not result from our own doing. According to Lyons, transitions between seemings cannot be controlled by the subject and could at best be thought of as produced by unconscious inferential mechanisms; this would explain the impression that all seemings are given to us. Advocates of the receptivity approach might concede that all seeming-to-seeming transitions are produced by sub-personal mechanisms. An unpalatable consequence for the receptivity approach (which claims that all seemings produced by sub-personal mechanisms are receptive seemings) would be, however, that all seemings should be thought of as receptive, and thus as always capable of conferring prima facie justification.

Ghijsen (2016) notes that it is hard to find a coherent characterization of the background knowledge that the subject must have to carry out good quasi-inferences. Suppose the background knowledge required to appropriately rationalize the transition from a receptive seeming that this nugget is yellowish in a given way F to a non-receptive seeming that this nugget is gold is the propositional knowledge that whatever looks yellowish in a way F is gold. How could this knowledge be acquired by the subject? It could not be acquired via quasi-inferences from receptive seemings of objects looking yellowish in a way F to non-receptive seemings of objects looking gold. For these quasi-inferences presuppose the background knowledge that we want to characterize. If this background knowledge were conceived of in terms of knowledge-how, it would have better prospects for helping. However, what exactly would this knowledge-how consist of? If this account is meant to be internalist, it cannot coincide with the subject’s mere ability to reliably recognize gold when she comes across it. Thus, the problem remains open.

iii. The Knowledge-How Account

The last inferentialist account we survey, developed by Markie (2013), holds that S’s perceptual experience as if P is epistemically appropriate (namely, it provides S with prima facie justification for believing P) if it results from S’s knowledge-how about the proposition that P. This knowledge-how consists of S’s being disposed to have the perceptual experience as if P in response to S’s attending to particular features of her overall experience and S’s being disposed to do so in virtue of her having background knowledge that these particular features of her experience indicate that P is true (Markie 2013). Consider an expert orthopedist who has a perceptual experience as if (P) the X-ray shows a knee suffering from osteochondritis. The experience provides the orthopedist with prima facie justification for believing P, for the experience is epistemically appropriate. This is so because the experience results from her knowledge-how about P. This knowledge-how involves both her being disposed to entertain that specific perceptual experience in response to her attending to the particular features of her overall experience, and her having that disposition in virtue of having background knowledge that these particular features of her experience indicate that P is true.

More accurately, Markie analyzes S’s knowing-how as being constituted by (i) S’s disposition to have a perceptual experience as if P after her shift of attention to relevant features of her overall experience, (ii) S’s possession of background information that anything displaying those features is appropriately connected in some factual sense with P (for example, background evidence or justification that any object provided with these features is actually gold), and (iii) the character of S’s disposition being at least partly determined by S’s background information.

For Markie, S’s knowledge-how about P need not be accompanied by S’s reliable practice. (In the evil demon scenarios, the expert knows how to identify gold, though he fails to identify it reliably.) Furthermore, even when S’s practice is reliable, this alone does not provide S with the relevant knowledge-how. S’s reliable practice must be accompanied with S’s understanding that the right object or type of object (for example gold) has been identified by her.

Markie himself acknowledges that both the method of S’s acquiring the relevant knowledge-that and the latter’s relationship with S’s knowledge-how require further specification. One might also doubt that knowledge-how always coexists with knowledge-that, and that knowledge-how depends on knowledge-that when they coexist (Lyons 2016). Furthermore, the knowledge-how account of cognitive penetration is afflicted by a problem analogous to one that affects McGrath’s. Markie’s account requires all epistemically appropriate perceptual experiences to depend on S’s understanding and doing. For it is S’s knowledge-that which determines S’s disposition to form appropriate perceptual experiences in response to given features of her experience. But this knowledge-that is an agent-level factor. So, the knowledge-how account holds that the formation of appropriate perceptual experiences happens at the personal level, which is implausible (Lyons 2016).

Another difficulty for McGrath’s receptivity account also seems to afflict the knowledge-how account. Markie’s account might not address what is really at stake in cases of bad cognitive penetration. For bad cognitive penetration might directly affect the features of S’s experience that S attends to and in response to which she forms her perceptual experiences (Lyons 2016). Markie considers this criticism and bites the bullet: for him, if cognitive penetration directly affected these features, S’s experiences would still be capable of conferring justification, provided they were produced through the exercise of S’s relevant knowing-how.

iv. Presentational Conservatism

Chudnoff’s (2019) presentational conservatism is a restrained version of phenomenal conservatism that is both accessibilist and mentalist. Presentational conservatism imposes the following additional condition necessary for a perceptual experience to supply immediate justification: the experience must have a presentational phenomenology.

Suppose you see a picture of a dog with an occluded middle part. Your perceptual experience is presentational with respect to the left part of the dog and its right part, but not with respect to its middle part. This is so even though the middle part of the dog is somehow represented in the picture.

According to Presentational Conservatism it is only those contents with respect to which an experience has presentational phenomenology that prima facie justifies on its own, that is, immediately. If it justifies other contents, then it does so mediately. That the justification is mediate does not mean that it is remote or difficult to attain. Your experience of the partly occluded dog, for example, justifies you in believing various things about the dog’s middle both because they are made likely by the propositions about the dog’s rightward and leftward parts that it immediately justifies, and even entailed by some of the propositions about the whole dog that it immediately justifies (Chudnoff 2019, p. 6).

Chudnoff suggests three different ways in which presentational conservatism can account for cases of bad cognitive penetration, depending on what proposition is taken to be the target and what part of one’s experience cognitive penetration is taken to affect. Chudnoff focuses on the Angry Jack example. Let us consider all three accounts in turn.

Here is the first. Consider the proposition (a) Jack’s eyes and mouth are neutrally shaped, and the proposition (b) Jack is angry.

Jill’s experience immediately justifies her in believing (a) because it is both represented and presented; Jill’s experience doesn’t immediately justify her in believing (b) because though represented it isn’t presented; Jill’s experience would mediately justify her in believing (b) if she had reason to think that if (a) is true then (b) is true; but she doesn’t; so it doesn’t (Chudnoff 2019, p. 10).

Chudnoff suggests that Jill’s experience does not have presentational phenomenology with respect to (b) because anger is a mental state and, as such, is invisible. So, it cannot presentationally seem to Jill that Jack is angry.

This account could be extended to other cases of cognitive penetration in which the penetrated perceptual experience results in a mental state without presentational phenomenology. In all these cases the perceptual experiences would be downgraded (see Brogaard 2018 for a similar strategy).

Epistemologists and philosophers of mind who believe that high-level properties are genuinely presented in our experiences might deny, however, that Jill’s experience that Jack is angry does not have presentational phenomenology. These philosophers might raise similar objections to analogous accounts of experience downgrade. This exposes a general weakness of presentational conservatism: since it is somewhat controversial what things and features can genuinely be presented in perceptual experience (Siegel 2016), if presentational conservatism is endorsed, it becomes equally controversial what sort of beliefs can be immediately justified by our perceptual experiences.

This is Chudnoff’s second explanation. Consider again proposition (a) and the proposition (c) Jack’s eyes and mouth express anger. Chudnoff thinks that although anger is not visible, one can see facial organs expressing anger. Facial organs expressing anger is something that can presentationally seem to one to be the case. By these lights, a presentational conservative can claim that Jill’s experience has presentational phenomenology with respect to both (a) and (c). Hence,

Jill’s experience immediately justifies her in believing (a) because it is both represented and presented; Jill’s experience immediately justifies her in believing [c] because it is both represented and presented; but Jill’s justification for believing (a) defeats Jill’s justification for believing [c] because she knows that if (a) is true, then [c] is not true . . . Though Jill’s experience prima facie justifies her in believing that Jack’s eyes and mouth express anger, all things considered Jill does not have justification for believing that Jack’s eyes and mouth express anger because she has justification for thinking that Jack’s eyes are horizontal, as is his mouth and she knows that horizontal eyes and mouth do not express anger (Chudnoff 2019, pp. 10–11).

What is affected in this case is only all-things-considered justification. Chudnoff suggests that the justification for (a) defeats the justification for (c), and not the other way around, because Jill’s experience has stronger presentational phenomenology with respect to (a). Had Jill’s experience had stronger presentational phenomenology with respect to (c), the justification for (c) would defeat that for (a).

Both explanations above assume that cognitive penetration does not change Jill’s experience with respect to the low-level neutral characteristics of Jack’s face. Chudnoff’s third explanation assumes that cognitive penetration causes Jill’s experience of Jack to have low-level, angry-face features. Chudnoff acknowledges that in this case Jill’s experience would have presentational phenomenology with respect to the proposition that the features of Jack’s face express anger. Therefore, her perceptual experience would provide immediate justification for (c) and, indirectly, for (b). Some epistemologists would find this result counterintuitive.

d. Other Options

This section presents four miscellaneous responses to the epistemic problem of cognitive penetrability that do not clearly fit the internalism-externalism dichotomy.

i. Sensible Dogmatism

Brogaard’s (2013) sensible dogmatism holds that experiences are mere collections of sensory impressions. Brogaard calls the sensory impressions that constitute an experience its phenomenal contents. She calls the “interpretations” of experiences, that is to say, the conceptual or propositional ingredients of perception, phenomenal seemings.

Sensible dogmatism is a special version of phenomenal conservatism that implies the downgrade thesis. This is its core principle:

If it seems to S as if [P] and the seeming is grounded in the content of S’s . . . experience, then, in the absence of defeaters, S thereby has at least some degree of justification for believing that [P] (Brogaard 2013, p. 278).

S’s seeming that P is grounded in a phenomenal content Q of an experience E that S has just in case (i) reliably, if Q is a content of S’s experience E, it seems to S as if P and (ii) reliably, if it seems to S as if P, P is true. (i) can be understood as: in most ‘hypothetical situations’ closest to the actual one in which Q is a content of S’s experience E, it seems to S as if P. (ii) prevents seemings from being grounded in the content of experiences by ‘sheer’ coincidence. (ii) does not require P to be actually true; it just requires P to be true in most of the closest ‘hypothetical situations’ where S has the seeming that P (Brogaard 2013).
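The two conditions can be written out schematically. What follows is an illustrative sketch only, not Brogaard’s own notation: the sets N_Q and N_P, the predicate Seems_S, and the reading of “most” as “more than half” of a finite, non-empty set of closest situations are all assumptions introduced here for illustration.

```latex
% Assumed notation (not in Brogaard 2013):
%   N_Q : the closest hypothetical situations in which Q is a content of S's experience E
%   N_P : the closest hypothetical situations in which it seems to S as if P
\begin{align*}
\text{(i)}\quad  & \frac{\lvert \{\, w \in N_Q : \mathrm{Seems}_S(P) \text{ holds at } w \,\} \rvert}{\lvert N_Q \rvert} > \tfrac{1}{2}, \\[4pt]
\text{(ii)}\quad & \frac{\lvert \{\, w \in N_P : P \text{ is true at } w \,\} \rvert}{\lvert N_P \rvert} > \tfrac{1}{2}.
\end{align*}
```

On this reading, (ii) can be satisfied even when P is false in the actual situation, provided that P is true in more than half of the situations in N_P.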

Sensible dogmatism can explain the novice prospector case as follows: the novice is not justified in his belief that P because (i) is not met. Since the desire to find gold is not present in most of the closest possible situations where the novice has the same sensory experience of the pebble, this sensory experience does not lead him, in those situations, to have a seeming that the pebble is gold (Brogaard 2013). Another way in which sensible dogmatism can explain the novice case is this: suppose the novice’s desire is present in most or all of the closest possible situations where he has the sensory experience of the pebble, leading him to have the seeming that the pebble is gold even in cases where it is not so. Then, (ii) is not satisfied. For the content of his seeming that the pebble is gold would not be true in most of the closest possible situations where it would seem to him that the pebble is gold (Brogaard 2013). In conclusion, since the novice’s seeming that this pebble is gold is not grounded in the content of his experience, his seeming does not justify his belief that the pebble is gold. It is easy to see, on the other hand, that the expert prospector’s seeming is grounded in the content of his own experience, so this seeming justifies his belief (Brogaard 2013).

Given the reliabilist component of Brogaard’s position, sensible dogmatism appears to be an externalist view. Yet Brogaard insists that it is a weak internalist position, for the mental states that provide S with justification are accessible to S, though the factors that determine whether those mental states are justification-providing are not.

The reliabilist components of Brogaard’s position make it inherit problems from externalist reliabilism. Consider, for instance, one consequence of sensible dogmatism: the seemings of a victim of the Matrix would not provide her with justification, because (ii) would not be met in the Matrix scenario (see Vahid 2014).

ii. The Imagining Account

Teng’s (2016) imagining account defends the downgrade thesis by drawing on a possible psychological explanation, presented in Macpherson (2012), of how cognitive penetration is produced in a subject S. Suppose S has a perceptual experience. According to Macpherson, one possible cognitive-penetration-causing mechanism involves the interaction of imagination and perceptual experience. In particular, it involves (i) the production of an imaginative experience by some mental state of S, and (ii) the interaction of this imaginative experience with S’s perceptual experience. The upshot is a novel phenomenal state of S with both the perceptual experience and the imaginative experience as contributors. As Teng emphasizes, since imaginative experiences are experiences in a sense fabricated by S, the phenomenal states resulting from a combination of an imaginative experience and a perceptual experience of S are to be considered to be partly fabricated by S as well. Cognitively penetrated experiences could be states of this type.

Teng finds it intuitive that an experience of S supplies S with prima facie justification for believing its content only if S does not fabricate (consciously or unconsciously) the experience. She infers from this that no imaginative experience of S could be a prima facie justification-provider. Teng concludes that since any cognitively penetrated experience of S is partly fabricated by S, it must be epistemically downgraded with respect to the fabricated part (Teng 2016).

A potential difficulty of this account concerns the explanation of the cases of good cognitive penetration. Teng submits that these cases might be explained by mere attentional shifts of S involving no imagining and capable of rendering certain objective features of the world more salient to S. She also suggests that S’s imagining might explain some specific cases of good cognitive penetration. For imagining could occasionally facilitate the perception of independent reality rather than interfering with it. Consider for instance the following experiment:

J. Farah (1985 and 1989) asked her participants to detect the presence of a faint letter H or T in a square while the participants projected a mental image of H or T onto the same location. It turned out that their detection was more accurate when they were imagining the same letter than a different one (Teng 2016, p. 25).

iii. The Analogy with Emotions

Vance’s (2014) account explains why a perceptual experience can be downgraded by its inappropriate etiology through drawing an analogy between cognitively penetrated experiences and cognitively penetrated emotional states.

Suppose S has an unjustified background belief that all foreigners are dangerous. One day S meets some foreigners and her background belief causes S to feel fear. Had she not had her unjustified belief, she would not have felt fear. On the basis of her fear, she forms the belief that the people before her are dangerous. Her fear is in this case downgraded: it cannot provide justification for her belief that those people are dangerous because it is grounded in a belief constituting a defective reason for her feeling. When emotions are grounded in such a defective way, their justificatory power decreases or ceases (Vance 2014). An emotional state with an etiology starting with a non-defective reason for the emotion could nevertheless be a justification-provider. For instance, S’s fear of a snake that S spots on her trail, caused by her justified background belief that snakes are dangerous, can provide S with justification for believing that walking on the trail is unsafe (Vance 2014).

Vance stresses that emotional states and perceptual experiences share extrinsic properties—such as psychological and epistemic features of their etiological structure—and intrinsic properties—such as their intentional character and distinctive phenomenology. From this, he derives that perceptual experiences, as well as emotions, can be downgraded with respect to their justificatory power. He submits that, in analogy with emotional states, this typically happens when perceptual experiences are grounded in unjustified beliefs.

A possible criticism of Vance’s account is that it is controversial whether the similarities between emotions and experiences could outweigh their differences in such a way that they both turn out to be rationally assessable states and in a similar way (Silins 2016).

iv. The Sensorimotor Theory of Perception

Vahid’s (2014) account of the cognitive penetrability problem and defense of the downgrade thesis rely on a conception of perceptual experience different from the traditional ones that conceive of perception as something given to us. Vahid’s conception is part of the extended cognition view of mental processes, which maintains that mental processes are partly constituted by environmental components located outside the subject’s body. Think of Otto, a memory-impaired man who uses his notebook to take notes that help him remember things he would otherwise forget. Otto’s cognition can be said to have been extended to his notebook.

While, on the received view, the notebook is not part of Otto’s cognitive processes, [the extended cognition thesis] takes Otto and his notebook to form a cognitive system where the information stored in the notebook functions as Otto’s non-occurrent, dispositional beliefs. Cognitive processes are not, thus . . . purely in the head (Vahid 2014, p. 453).

Similarly, perceptual experiences may not be only in the subject S’s head. The sensorimotor theory of perception—one of the extended perception accounts—turns on the thought that perceptual experience is not just produced by S’s brain processes but is constituted by the ways in which these processes enable S to interact with her environment. In this account, S’s perceptual experience depends on both the features of S’s perceptual apparatus and those of the world to which this apparatus is sensitive.

[W]hen looking at a red apple, the sensation of seeing the apple . . . merely consists in our understanding or knowledge of a class of relevant counterfactuals, e.g., that if one were to move one’s eyes or body with respect to the apple, the sensory signals change in a way characteristic of red, rather than green, apples. One’s experience of seeing a red apple just is the knowledge of the class of the relevant sensorimotor contingencies (Vahid 2014, pp. 454–455).

In this view, perceptual experiences result from S’s expectations, assumptions, suppositions, understanding or implicit knowledge about what would happen in terms of new inputs from the world if S interacted in specific ways with the things the perceptual experiences are about (see Vahid 2014). (This theory is closely related to a model of the mind called predictive coding; see Hohwy 2013 and Clark 2013.)

To understand Vahid’s account of the cognitive penetrability problem, let us go back to Expert and Novice and Angry Jack. Vahid maintains that only the expert has implicit knowledge of the counterfactuals describing the perceptual consequences of his interaction with the nugget—or, at least, that the expert’s knowledge of them is more thorough than the novice’s. So, when faced with a gold nugget, the two prospectors actually have different cognitively penetrated experiences. For the expert’s experience is constituted by more numerous and detailed perceptual expectations than those of the novice’s experience. This enables us to distinguish the good cognitive penetration of the expert’s perceptual experience and the bad cognitive penetration of the novice’s perceptual experience. Angry Jack is interpretable along similar lines. Jill’s initial unjustified belief that Jack is angry penetrates her experience of Jack’s face by producing in Jill all the typical perceptual expectations that constitute perception of anger. In this case, we can say that Jill’s perceptual experience is badly penetrated because most of her expectations are mistaken (Vahid 2014).

Why is the novice’s belief that the nugget is gold not justified by his perceptual experience? And why is Jill’s belief that Jack is angry not justified by her experience? To answer these questions Vahid appeals to an explanationist conception of epistemic justification according to which a proposition is justified as long as it is the best available explanation of the subject’s evidence.

In the version of the angry-looking Jack case . . . the truth of Jill’s belief is not the best explanation of her incorrect expectations and assumptions that constitute her experience of seeing Jack’s face. Only correct expectations and suppositions reflect the facts about the external world . . . Likewise, in the gold-digging case, the truth of the novice’s belief that the pebble is gold is not the best explanation of his (thin) class of sensorimotor knowledge constituting his output experience as [the] less complex and simpler hypothesis [that the novice does desire to find gold] can discharge this function (Vahid 2014, p. 457).

See Ghijsen (2018) and Macpherson (2017) for discussion and criticism.

3. Conclusion

This article has provided an introductory map to the contemporary debate on the problem of cognitive penetrability of perception for epistemic justification. Internalist accessibilists typically do not concede that justification is hostage to cognitive penetration and put forward resolute responses to the cognitive penetration problem. On the other hand, externalist reliabilists together with some internalists from the mentalist camp concede that cognitive penetration may affect justification negatively and attempt to provide explanations of why and how this can happen. There are a few alternative accounts of the cognitive penetration problem that cannot easily be classified within the internalism-externalism framework.

4. References and Further Reading

  • Audi, Robert. 1993. The Structure of Justification. Cambridge: Cambridge University Press.
  • Becker, Kelly. 2018. “Reliabilism.” Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/reliabil/
  • Brogaard, Berit. 2013. “Phenomenal Seemings and Sensible Dogmatism.” In Chris Tucker (ed.), Seemings and Justification. Oxford: Oxford University Press.
  • Brogaard, Berit. 2018. “Bias-Driven Attention, Cognitive Penetration and Epistemic Downgrading.” In Christoph Limbeck and Friedrich Stadler (eds.), Philosophy of Perception. Publications of the Austrian Ludwig Wittgenstein Society. De Gruyter.
  • Chudnoff, Elijah. 2019. “Experience and Epistemic Structure: Can Cognitive Penetration Result in Epistemic Downgrade?” https://philpapers.org/archive/CHUEAE-2.pdf (accessed on 1/5/2019).
  • Clark, Andy. 2013. “Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science.” Behavioral and Brain Sciences 36: 3, 181–204.
  • Connolly, Kevin. 2017. “Perceptual Learning.” In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/archives/sum2017/entries/perceptual-learning/.
  • Farah, Martha J. 1985. “Psychophysical Evidence for a Shared Representational Medium for Mental Images and Percepts.” Journal of Experimental Psychology 114:1, 91–103.
  • Farah, Martha J. 1989. “Mechanisms of Imagery-Perception Interaction.” Journal of Experimental Psychology: Human Perception and Performance 15:2, 203–211.
  • Fumerton, Richard. 2013. “Siegel on the Epistemic Impact of “Checkered” Experience.” Philosophical Studies 162:3, 733–739.
  • Ghijsen, Harmen. 2016. “The Real Epistemic Problem of Cognitive Penetration.” Philosophical Studies 173:6, 1457–1475.
  • Ghijsen, Harmen. 2018. “Predictive processing and foundationalism about perception.” Synthese. Open access. https://doi.org/10.1007/s11229-018-1715-x
  • Goldman, Alvin I. 1979. “What is Justified Belief?” In George Pappas (ed.), Justification and Knowledge. Dordrecht: Reidel.
  • Hansen, Thorsten, Olkkonen, Maria, Walter, Sebastian and Gegenfurtner, Karl R. 2006. “Memory Modulates Color Appearance.” Nature Neuroscience 9:11, 1367–1368.
  • Hohwy, Jakob. 2013. The Predictive Mind. Oxford: Oxford University Press.
  • Huemer, Michael. 2001. Skepticism and the veil of perception. Lanham, MD: Rowman and Littlefield.
  • Huemer, Michael. 2007. “Compassionate Phenomenal Conservatism.” Philosophy and Phenomenological Research 74:1, 30–55.
  • Huemer, Michael. 2013a. “Epistemological Asymmetries between Belief and Experience.” Philosophical Studies 162:3, 741–748.
  • Huemer, Michael. 2013b. “Phenomenal Conservatism Über Alles.” In Chris Tucker (ed.), Seemings and Justification. Oxford: Oxford University Press.
  • Lyons, Jack C. 2011. “Circularity, reliability, and the cognitive penetrability of perception.” Philosophical Issues 21, 289–311.
  • Lyons, Jack C. 2016. “Inferentialism and Cognitive Penetration of Perception.” Episteme 13:1, 1–28.
  • Macpherson, Fiona. 2012. “Cognitive Penetration of Color Experience: Rethinking the Issue in Light of an Indirect Mechanism.” Philosophy and Phenomenological Research 84:1, 24–62.
  • Macpherson, Fiona. 2017. “The relationship between cognitive penetration and predictive coding.” Consciousness and Cognition 47, 6–16.
  • Markie, Peter J. 2005. “The mystery of direct perceptual justification.” Philosophical Studies 126, 347–373.
  • Markie, Peter J. 2006. “Epistemically appropriate perceptual belief.” Noûs 40:1, 118–142.
  • Markie, Peter J. 2013. “Searching for true dogmatism.” In Chris Tucker (ed.), Seemings and Justification. Oxford: Oxford University Press.
  • McGrath, Matthew. 2013a. “Siegel and the Impact for Epistemological Internalism.” Philosophical Studies 162, 723–732.
  • McGrath, Matthew. 2013b. “Phenomenal Conservatism and Cognitive Penetration: The “Bad Basis” Counterexamples.” In Chris Tucker (ed.), Seemings and Justification. Oxford: Oxford University Press.
  • Pappas, George. 2014. “Internalist vs. Externalist Conceptions of Epistemic Justification.” In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/archives/fall2017/entries/justep-intext/.
  • Payne, Keith B. 2001. “Prejudice and Perception: The Role of Automatic and Controlled Processes in Misperceiving a Weapon.” Journal of Personality and Social Psychology 81:2, 181.
  • Poston, Ted. 2018. “Internalism and Externalism in Epistemology.” Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/int-ext/
  • Pryor, James. 2000. “The Skeptic and the Dogmatist.” Noûs 34:4, 517–549.
  • Raftopoulos, Athanassios and Zeimbekis, John. 2015. “Cognitive Penetrability of Perception: An Overview.” In Athanassios Raftopoulos and John Zeimbekis (eds.), The Cognitive Penetrability of Perception: New Philosophical Perspectives. Oxford University Press.
  • Siegel, Susanna. 2012. “Cognitive penetrability and perceptual justification.” Noûs 46, 201–22.
  • Siegel, Susanna. 2013a. “The Epistemic Impact of the Etiology of Experience.” Philosophical Studies 162:3, 697–722.
  • Siegel, Susanna. 2013b. “Reply to Fumerton, Huemer, and McGrath.” Philosophical Studies 162:3, 749–757.
  • Siegel, Susanna. 2015. “Epistemic Evaluability and Perceptual Farce.” In Athanassios Raftopoulos and John Zeimbekis (eds.), The Cognitive Penetrability of Perception: New Philosophical Perspectives. Oxford University Press.
  • Siegel, Susanna. 2016. “The Contents of Perceptions.” In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/archives/win2016/entries/perception-contents/.
  • Siegel, Susanna. 2017. The Rationality of Perception. New York: Oxford University Press.
  • Siegel, Susanna. 2018. “Can Perceptual Experiences Be Rational?” Analytic Philosophy 59:1, 149–174.
  • Silins, Nicholas. 2016. “Cognitive Penetration and the Epistemology of Perception.” Philosophy Compass 11:1, 24–42.
  • Skene, Matthew. 2013. “Seemings and the Possibility of Epistemic Justification.” Philosophical Studies 163: 539–59.
  • Stokes, Dustin. 2012. “Perceiving and desiring: a new look at the cognitive penetrability of experience.” Philosophical Studies 158: 3, 477–492.
  • Stokes, Mark B. and Payne, Keith B. 2011. “Mental Control and Visual Illusions: Errors of Action and Construal in Race-Based Weapon Misidentification.” The Science of Social Vision 7, 275–295.
  • Teng, Lu. 2016. “Cognitive Penetration, Imagining, and the Downgrade Thesis.” Philosophical Topics 44:2, 405–426.
  • Tucker, Chris. 2010. “Why open-minded people should endorse dogmatism.” Philosophical Perspectives 24:1, 529–545.
  • Tucker, Chris. 2011. “Phenomenal Conservatism and Evidentialism in Religious Epistemology.” In Kelly James Clark and Raymond J. VanArragon (eds.), Evidence and Religious Belief. Oxford: Oxford University Press.
  • Tucker, Chris. 2013. “Seemings and Justification: An Introduction.” In Chris Tucker (ed.), Seemings and Justification. Oxford: Oxford University Press.
  • Tucker, Chris. 2014. “If Dogmatists Have a Problem with Cognitive Penetration, You Do Too.” Dialectica 68:1, 35–62.
  • Tucker, Chris. 2019. “Dogmatism and the epistemology of covert selection.” https://philpapers.org/rec/TUCDAT (accessed on 1/5/2019).
  • Vahid, Hamid. 2014. “Cognitive penetration, the Downgrade Principle, and Extended Cognition.” Philosophical Issues 24:1, 439–459.
  • Vance, Jonna. 2014. “Emotion and the new epistemic challenge from cognitive penetrability.” Philosophical Studies 169:2, 257–283.

Author Information

Christos Georgakakis
Email: c.georgakakis@abdn.ac.uk
University of Aberdeen
United Kingdom

and

Luca Moretti
Email: l.moretti@abdn.ac.uk
University of Aberdeen
United Kingdom

Simone Weil (1909—1943)

The French philosopher Simone Weil is a confronting and disconcerting figure in modern philosophy. This is not simply because she was so many things at once (ascetic and mystic, teacher and factory worker, labour activist and political militant, social thinker and piercing moral psychologist, critical Marxist and heterodox Christian theologian) but because of the striking “untimeliness” of her thought. For unlike philosophers in the analytic tradition, she insisted that life and philosophical reflection are connected on the deepest ethical level; and, unlike those in the postmodern tradition, she felt free to draw on terms like “truth,” “reality,” “the sacred,” “justice,” “soul,” and “God.”

Weil, of course, was not an analytic philosopher, nor a proto-postmodernist. She came to philosophy in the interwar years in a philosophical milieu of political radicalism, phenomenology, and emerging existentialism. As did most of her contemporaries, she saw philosophy in terms of the nature and challenges of the human condition, though she differed from the existentialists as to what this meant.

Whereas Jean-Paul Sartre and Simone de Beauvoir saw things in terms of the individual’s radical freedom to choose their values in a Godless world, Weil took a different path. Her concern was not to perfect herself as a replacement God figure, creating values out of a supposed absolute freedom, but to face up to, and attend to, the real existence of other people. Whereas the existentialism of Sartre saw him faced with the challenge of showing how morality was even possible, Weil took the possibility of morality as a given (as an essential and fundamental modality of human life and experience, however partial and flawed its manifestations) and sought to show what it was to take morality seriously.

Taken that way, moral life rested on our capacity to care for others, where this meant to care for them as they were, and not as a means or obstacle to any end of our own, even that of our moral perfection or virtue. To refuse this attention was to read the world so that nothing and no-one was sacred, not even oneself. This reading gave us the world of power and so the sovereignty of force, and it was the ultimate logic of force “that it turn[ed] anyone subject to it into a thing.”

Such a reading of the world denied the ethical, yet equally it was precisely this denial the ethical sought to overcome. Here, for Weil, was a fundamental contradiction at the heart of ethical life. It was not a contradiction that meant the impossibility of that life; rather, it showed us that the ethical was, ultimately, and at its foundations, something supernatural.

This article looks at Weil as a moral philosopher in a tradition that runs through Plato to Kant: one who took morality with a seriousness, with an utter commitment, alien to those philosophers tempted by scepticism or, in reaction, by a desire to find some rational foundation on which to securely rest an otherwise threatened edifice.

Table of Contents

  1. Life
  2. Writings
  3. Suffering, Oppression, Liberty
  4. Affliction, Detachment, the Impersonal, and the Sacred
  5. Uprootedness and the Needs of the Soul
  6. The Moral Ground
  7. References and Further Reading
    1. Primary
    2. Biographical
    3. Secondary

1. Life

Simone Weil was born in Paris on February 3, 1909, the second of two children born to comfortably off agnostic and secular Jewish parents. Her father was a medical doctor, and her brother André, three years her senior, would become one of the most renowned mathematicians of the 20th century.

From the start Weil was both intellectually precocious and morally disconcerting. The intellectual capacity ran in the family (indeed, at 14, Weil would have a personal crisis in the face of what she considered her brother’s far superior abilities), but the moral sensitivity was her own and showed itself in various ways (for instance, refusing at age 5 to accept a necklace as a present on grounds of the discriminatory nature of luxury, and the very next year refusing to eat more sugar than that allotted to French troops as they battled the Germans).

She was educated at a number of schools and by private tutors before attending the Lycée Henri-IV as a pupil of the greatest philosophy teacher of the period, Émile Chartier (“Alain”). In 1928, and at her second attempt, she gained admission to the École Normale Supérieure, beating Simone de Beauvoir into second place in the Exam for General Philosophy and Logic. She studied philosophy there, graduating in 1931 with a diplôme d’études supérieures on the basis of her thesis “Science et perfection dans Descartes.” The same year, she passed the French Civil Service Examination (the agrégation) and was appointed to a girls’ secondary school in the regional centre Le Puy, where she taught until 1936, with many breaks to pursue union activities, investigate Communist labour organizations in Germany, and fight on the Republican side in the Spanish Civil War.

After burning her foot badly stepping into a camouflaged pot of hot cooking oil, she left Spain and spent time in Portugal, then Italy, where she had her first mystical experiences.

The outbreak of World War II saw her in Paris, then, after the German invasion, in Marseille, publishing essays and doing what she could for those, often Jews like herself, seeking escape from Vichy France and the Nazi threat. In 1942, she accompanied her parents first to Morocco, then to New York, though she herself, determined to contribute to the Free French cause, soon returned to Europe, now to London. Weakened by inadequate nutrition and anguish, she died of tuberculosis on the evening of August 24, 1943, and, while not a baptised Catholic, was buried in a pauper’s grave in the Catholic Section of Bybrook Cemetery in Ashford, Kent.

2. Writings

Weil’s writings (collected now in 20 volumes) were produced in a mere 15 years. Much—including much of that which is most widely known—was published posthumously. Most of the work published in her lifetime was in the form of short essays for small political and literary journals, addressed to particular audiences. Such writings form only a small part of her collected work.

During her short life, she was most widely known as a political writer of the Left, an unorthodox and critical Marxist. Her most important work in this genre (though unpublished until 1955) was Reflections Concerning the Causes of Liberty and Social Oppression (1934). Around 1935, and especially after her first mystical experience in 1937, her writings took what many believed to be a new, religious direction. These writings, essays, notebooks, and letters she entrusted to the lay Catholic theologian Gustave Thibon in 1942, when, with her parents, she fled France. With the editorial help of Weil’s spiritual consultant (and sparring partner) Fr. Perrin, selections of these writings first made Weil widely known in the Anglo-American world. The serious effort for a complete publication of all Weil’s writings was largely the result of Albert Camus’ discovery of them while an editor at Gallimard (in 1951, he called her “the only great mind of our time”). In 1988, Gallimard completed publication of her writings.

3. Suffering, Oppression, Liberty

In Memoirs of a Dutiful Daughter, de Beauvoir reports her first and perhaps only personal interaction with Weil in, most likely, 1929. “A great famine had just begun to devastate China,” she writes, and:

I was told on hearing the news she [Weil] had wept; these tears commanded my respect even more than her philosophical talents. I envied her for having a heart that could beat right across the world. One day I managed to approach her. I don’t remember how the conversation began; she declared in no uncertain terms that one thing alone mattered in the world today: the Revolution that would feed all the people on earth. I retorted, no less peremptorily, that the problem was not to make men happy, but to find a meaning for their existence. She looked me up and down: “It is easy to see you have never gone hungry,” she said. Our relationship stopped there. (239)

In this small exchange we see much of that which would shape Weil’s thought. What was basic for human life, and so a philosophy that dealt with the concerns of such a life, was not a quest for meaning, but rather a search for sustenance, for food. The food required was, in the end, both physical and spiritual, for there were needs of the body and needs of the soul. First there was, however, the need for physical sustenance. It followed that the primordial caring constitutive of the ethical must look always and first to the physical needs of other human beings. “It is an eternal obligation toward the human being not to let him suffer from hunger when one has a chance of coming to his assistance.”

This eternal obligation (eternal because constitutive) placed us as human beings into a shared community of mutual obligations.

For the early Weil, this eternal ethical obligation seemed, as it did at the time to many others, to be clearly and equally a political obligation (“revolution”). The task was to comprehend and, so far as possible, to deliver a social order that, because it enabled us to attend to the material needs of others, allowed those needs to be met.

It was here she found Karl Marx essential. “Marx’s truly great idea,” she wrote, was “that in human society as well as in nature nothing takes place otherwise than through material transformations.” It followed that to effectively meet our fundamental obligation required we uncover “the material conditions which determine our possibilities of action… conditions… defined by the way in which man obeys material necessities in supplying his own needs, in other words, by the method of production.”

For Weil, Marx could be understood as attempting to bring about a social order that enabled all in it to live, and so to be treated as ends-in-themselves. As such, it had to be a society free from oppression; and so a society in which all could (and did) attend to others, rather than viewing them indifferently, or as facilitating or hindering some personal or sectional interest or goal.

The trouble with Marx was not his failure to see this, it was his failure to understand the ultimate roots of oppression, and so what it would mean to overcome it. Thus, he thought that what we had to do was encourage the productive forces of capitalism so that they broke asunder the chains of labouring necessity; and he thought that the way to do this was to banish private property and so the drive for surplus value extraction.

However, as she saw it, this was not enough, and she pointed out that Marx himself at times seemed clearly to appreciate this. For the roots of the oppression that diminished, even sometimes obliterated, our capacity to attend to the basic needs of others did not lie solely, even mainly, in the fact of private property. She made the point this way:

“In the factory”… [Marx] writes in Capital, “there exists a mechanism independent of the workers, which incorporates them as living cogs… The separation of the spiritual forces that play a part in production from manual labour, and the transformation of the former into power exercised by capital over labour, attain their fulfilment in big industry founded on mechanization. The detail of the individual destiny of the machine-worker fades into insignificance before the science, the tremendous natural forces and the collective labour which are incorporated in the machines as a whole and constitute with them the employer’s power.” Thus the worker’s complete subordination to the undertaking and to those who run it is founded on the factory organization and not on the system of property [emphasis added]. (OL 9-10)

For Weil, the logic of “the factory system” that Marx had pointed to, even as he had missed its importance, was not limited simply to that system. It was, rather, a matter of the division—inherent to any social order above the most rudimentary—between intellectual and physical labour. This division was, at the same time, a division between people, dividing the human world into “two categories of men: those who command and those who obey.” This division undermined the foundations of ethical life because those who commanded could not avoid “reading” those they ordered about as—in the light of their being ordered about—means (or obstacles) to the desired ends. Such power over others as instruments or obstacles did two things to those who wielded it: it “intoxicated” them so that they no longer saw their own vulnerability before the necessities and contingencies of the world (their “ultimate fragility”), nor did they see, because of this intoxicated blindness, the humanity (and so the suffering) of those they lorded it over.

Still, as she saw it at this stage (before her discovery of the “enigma” of affliction), this did not mean that the capacity to attend to, and to care for, the suffering of others demanded “a miracle,” and so was something “supernatural.” What it demanded was, rather, a certain technique of compassion. “Human beings,” she wrote, “are so made that the ones who do the crushing feel nothing; it is the person crushed who feels what is happening.” If, in such a world—that is to say, in our world—ethical life was to find its footing, the challenge was clear: “unless one has placed oneself on the side of the oppressed,” she wrote, unless one “feel[s] with them, one cannot understand.”

4. Affliction, Detachment, the Impersonal, and the Sacred

At this point, for all its elegance and clarity, Weil’s moral philosophy was, ultimately, nothing out of the ordinary. Ethical life presupposed caring for others; and caring for others counted most essentially when others were in need, and so when they were suffering. The moral task was to let it register as it registered in and on the suffering one. It demanded an attentive compassion, understood as “the rarest and purest form of generosity.”

As an intellectual or theoretical stance, all this was unobjectionable, even admirable. However, it could not be simply and completely an intellectual or theoretical stance, for ethical life was also, and fundamentally, a practical matter. Marx himself had insisted on this. He said, “the philosophers have only interpreted the world in various ways; the point, however, is to change it.” To change it in an ethical direction and from an ethical stance, however, one had to do more than simply say or think that one understood the oppression, and so the suffering, one sought to identify, alleviate, and eliminate. This was the problem with “the major Bolshevik leaders,” for they pretended “to create a free working class and yet none of them—definitely not Trotsky, and neither I think, Lenin… have… stepped foot into a factory and therefore have the least idea of the real conditions which determine the servitude or freedom of the workers.”

Obligations might be acknowledged, even fought for in revolutionary struggle, but to be truly recognised as the obligations, they had to penetrate. The point was particularly clear with suffering. For to acknowledge suffering as an ethical reality, it was not enough to endorse the description “so and so is suffering,” for that might be done by an entirely disinterested or impartial observer; rather, one needed to be penetrated by that suffering, and, out of the practical necessity involved in that penetration, to do what one could to meet the obligation that suffering imposed.

Here lay the real problem, and one that only came home to Weil when, in an effort to live up to and to live out her ethical vision, she went to work with those she saw at the time as most clearly of the class of those “who obey”: oppressed, menial, piece-working factory labourers. In this decision and project, she meant to place herself “on the side of the oppressed,” to “feel with them,” and so to understand and to act. Here she would live—and in living, demonstrate—the fundamental penetrative point of the ethical, of obligation, in (and into) the realm of force.

What happened, however, was that she found—in others and in herself—something that seemed to tear the realm of force and the ethical life irretrievably apart: she discovered that suffering that is affliction (malheur, literally “calamitous misfortune”). The suffering “seared the soul.”

It was affliction that turned her moral philosophy away from the conventional and that led her to speak of ethical life in religious terms; and it was affliction that made, or allowed, her to see that what made a human being sacred, what made them the kind of being whose suffering counted, was no ascriptive empirical fact about them, no matter how essential to their “personality,” but was, rather, the impersonal in them.

Affliction was suffering that robbed its bearer of all dignity, both in the eyes of others and in their own eyes. It left them “mutilated,” valueless, worthless. It involved the twinned and catastrophic impact of physical pain (which might be simply the fear of such pain), and social humiliation, social degradation. Affliction, she wrote in a letter to Father Perrin, “takes possession of the soul and marks it through and through with its own particular mark, the mark of slavery,” and it was what she found, in her co-workers and so in herself, as they laboured for Alsthom and Renault. “The affliction of others entered into my flesh and my soul… There I received forever the mark of slavery” (WG 66-67).

What this experience showed her was that her initial political reading of the conditions essential to the morality of attentive caring was ultimately a superficial one: one that did not take morality and its demands on us seriously enough. While there was no doubt that things could be done to reduce the opportunities and occasions for suffering, affliction showed us that human identity, and so the human sense of self dignity and the dignity of others, was inherently fragile, able to be shattered at any time by the unforeseen contingencies of necessity and force that left “the victim writhing on the ground like a half-crushed worm,” “like a butterfly pinned alive into an album.” Unless this terrible and eternal fact had been allowed to penetrate us, even the best-intentioned reforms, even especially those driven by revolutionary righteousness, would produce, in due course, their own half-crushed worms, their own pinned-alive butterflies.

To take morality seriously meant taking affliction seriously, for if suffering mattered at all, it certainly mattered here. It was just at this point, however, where everything was in the balance, that the inadequacy of her previous understanding revealed itself, for with affliction caring attention—being penetrated by the object—was “impossible.” In the essay “The Love of God and Affliction,” she wrote that the afflicted:

…have no words to express what has happened to them. Among the people they meet, even those who have suffered much, those who have never had contact with affliction (properly defined) have no idea what it is. It is something specific, irreducible to any other thing, like sounds we cannot explain at all to a deaf-mute. And those who themselves have been mutilated by affliction are in no state to bring help to anyone at all, and nearly incapable of even desiring to help. (WG 120)

In fact, it was not simply that those who had never experienced affliction could not comprehend it, it was that any normal, “healthy” human being naturally fled from such recognition, from such penetration: “thought flees from affliction as promptly, as irresistibly, as an animal flees death,” and it did so for a like reason—for affliction manifested that force that turns a human being into a thing. It might not do so by killing outright, but—in a way even more shocking—it managed the paradoxical horror of “turn[ing] a human being into a thing while he is still alive.”

To care for the afflicted, to have been penetrated by affliction, and so to have enacted and lived that point where ethical life meets force (and—the same thing—to make real the point where justice meets and condemns slavery), was to love “where there is nothing to love.” This was why “when compassion truly produces itself, it is a miracle more astonishing than walking on water, healing the sick or even the resurrection of the dead.”

To understand the miracle that gave ethical authority power in a world of amoral force and necessity meant understanding what it was to love human beings in so far as they come to be “read” by themselves and others “as nothing.”

This idea of attending to, of caring for, and so being penetrated by, a suffering that removed from its bearers “everything that makes us human” meant for Weil two things.

First, that what grounded our attention, our love, did not rest on or presuppose any positive (“valuable”) ascriptive fact about a person (for instance, their sense of rights, of freedom, their dignity or demand for respect, even their sense of hope or longing for the good). All these things, as she saw it, were matters merely of our “personality,” and it was our personality that, in affliction, was destroyed and annihilated. If there was to be any moral connection here, what was crucial could not be anything personal and individuating; as it were, something that stood there, able, as Eric O. Springsted put it, to “overcome circumstances, no matter how bad they are.” To the contrary, and as affliction showed us and the intoxication of power blinded us, “We possess nothing in the world—a mere chance can strip us of everything.”

And second, that to be penetrated by such suffering, such affliction, and so to recognise and respond to it, meant losing one’s own “personality,” one’s own individuality (“the power to say ‘I’”), and so to oneself experience the “void” of the living non-existence that is affliction. This was to be “de-created.” It was to accept the death, the absence, of all that made up our personality, and so of all that was particular in us that “attached” us to the world, and so made of it a kind of fantasy world, focally arrayed, and not something independent, impartially available, and so real. She wrote:

The reality of the world is the result of our attachment. It is the reality of the self which we transfer into things. It has nothing to do with independent reality. That is only perceptible through total detachment. Should only one thread remain, there is still attachment. (G&G 14)

Affliction destroyed the “I” of attachment, but it did not destroy or extinguish the possibility of ethical life and so the obligation to attend to such affliction. How could it? The void was real, as the necessity of avoiding, of fleeing, from it, brought home. It followed that the ultimate ground of value in us—the one that survived affliction insofar as it grounded an absolute obligation to meet and alleviate that personality annihilating suffering—was the “impersonal” in us, not the “personal.” In the 1943 essay “Human Personality,” she wrote:

Neither the person nor the human person in him or her is holy to me… Far from it: it is that which is impersonal in a human being. All that is impersonal in humankind is holy, and that alone. (SE 10,13)

Weil found it natural, even necessary, to speak of the impersonal in terms of our “soul,” and so of that which was “holy” in us, that which was “sacred,” and to view the de-creative capacity to attend to the impersonal in terms of “grace.” She found it equally natural, even necessary, to see the paradigm instance of this impersonality and its recognition, in the caring, afflicted, sacrifice of the Christ of the Crucifixion. However, just as often she spoke of the impersonal in terms of truth and (for her an aspect of the same thing) beauty, and it is this way of speaking that is perhaps the most instructive for philosophers, deriving as it does, and in her own unique way, from the philosopher she most valued, Plato.

For Weil, the pursuit of truth and our receptivity to beauty demanded, and so exhibited, the same kind of open, loving attention to the impersonal that was constitutive of the ethical life and its justice bringing gaze. She pointed, as she often did, to mathematical truth to explain the point. “If a child is doing a sum and does it wrong,” she wrote, “the mistake bears the stamp of his personality. [But] if he does the sum exactly right, his personality does not enter into it at all.” Her idea was that any error here would have to be explained in terms of something individual to the child calculator—for obviously a sum, being mistaken, could not explain itself. However, a sum done “exactly right” just was explained, and completely explained, by itself; it is what, by arithmetical necessity, emerged in an act of attention filled with, penetrated by, the relevant numbers and (so) their relationships. Here there was nothing essentially personal, as there was in any mistaken calculation, only the impersonal—and so universal—truth of the sum as revealed in an act of pure attention.

Of course, a sum done rightly possessed a beauty that one done wrongly lacked, and it was here truth and beauty came together. Not only because the perception or awareness of the beautiful demanded just that impersonal attention ethical life demanded, but—and this was the astounding and contradictory, indeed the redeeming aspect of affliction—because that which we selflessly attended to, that which we allowed to penetrate us as it was in itself, and so in all its truth, was, for that very reason, seen and experienced, even in the horrors of affliction, as (also, at the same time, eternally) beautiful. This, for Weil, was just how it was when it came to loving attention.

For Weil the internal tie between truth and beauty and loving attention—the tie that was constitutive, so “eternal,” in ethical life—found expression in the occasional miracles of compassionate awareness we might come across in life. However, we could find it expressed, too, in two works of supreme beauty: Homer’s Iliad, and the Gospels. In the authors of both, as they shaped their texts, we find expressed “the sense of human misery [that] was the precondition for justice and love.” Here was to be found “the incredible bitterness” of detached, sacred, justice as it penetrated into the ethical void of the world of force.

In the Iliad, Weil wrote, this bitter justice:

proceeds from tenderness and that spreads over the whole human race, impartial as sunlight. Never does the tone lose its coloring of bitterness; yet never does the bitterness drop into lamentation. Justice and love, which have hardly any place in this study of extremes and of unjust acts of violence, nevertheless bathe the work in their light without ever becoming noticeable themselves, except as a kind of accent. Nothing precious is scorned, whether or not death is its destiny; everyone’s unhappiness is laid bare without dissimulation or disdain; no man is set above or below the condition common to all men; whatever is destroyed is regretted. Victors and vanquished are brought equally near us; under the same head, both are seen as counterparts of the poet, and the listener as well. (25)

Homer, in the Iliad, saw the infinite value and fragility of human life with a loving, “impersonal,” and (so) unsentimental compassion. He was penetrated by all—Greek and Trojan, defeated and momentarily victorious, Achilles and Priam—and, bathed in his impersonal love, fashioned from their lives an object of supreme, eternal, beauty.

5. Uprootedness and the Needs of the Soul

In December 1942, Weil arrived in London from New York, desperate to contribute to the cause of the Free French. In nine months, she would be dead.

In those months, she returned to the political concerns first broached in Oppression and Liberty. She did so reluctantly, and only because her proposal to train and lead a corps of front-line nurses had been rejected (de Gaulle, on reading her proposal, had exclaimed, “but she’s mad!”). Instead she was set to work analysing political documents sent to London from Resistance Committees in France, many of which concerned the reconstruction of France after the hoped-for Allied victory.

Weil’s contributions to this literature—Draft for a Statement of Human Obligation and The Need for Roots: Prelude towards a declaration of duties towards mankind—were never finally completed, but what was completed lets us see how she brought the moral seriousness she had developed and explored in the years since 1934 to those political concerns she had always had. While she may not have sought the task, she embraced it as a necessity. That was because while it was one thing, and a great thing, to have attended to the suffering and affliction of others, much of that suffering was the result of “social force,” and so the obligation to respond to that suffering had to address those forces. After all—as she had acknowledged from the start—morality at any stage beyond the socially rudimentary led inevitably to politics.

The very titles brought out, in a way only implicit in Oppression and Liberty, the untimeliness of her moral and political thought. For she did not begin with rights, nor with the ideal of liberal freedom encapsulated in Hobbes’ remark that a free man “is he that… is not hindered to do what he has a will to.” She built, rather, on the internal ethical connection between need and obligation:

Obligation is concerned with the needs in this world of the souls and bodies of human beings, whoever they may be. For each need there is a corresponding obligation: for each obligation a corresponding need. There is no other kind of obligation, so far as human affairs are concerned. (SE 21)

Needs and obligations were more fundamental than rights of any kind. Indeed, to think rights fundamental to “social conflicts” was itself a grave moral error, for it “inhibit[ed] any possible impulse of charity on both sides.” She continued:

Relying almost exclusively on this notion [“rights”], it becomes impossible to keep one’s eyes on the real problem. If someone tries to browbeat a farmer to sell his eggs at a moderate price, the farmer can say ‘I have the right to keep my eggs if I don’t get a good enough price.’ But if a young girl is being forced into a brothel she will not talk about her rights. In such a situation the word would sound ludicrously inadequate. (SE 21)

For Weil, rights were “middle level” moral concepts. They were not, and could not be, fundamental or “eternal.”

An obligation which goes unrecognised by anybody loses none of the full force of its existence. A right which goes unrecognised by anybody is not worth very much… Rights are always found to be related to certain conditions. Obligations alone remain independent of conditions. They belong to a realm situated above all conditions, because it is situated above this world. (NR 18)

The fundamental political obligation imposed equally on all of us, and just because of our shared humanity, was the obligation, according to our responsibilities and the extent of our power, to work to reduce to the barest minimum “all the privations of soul and body which are liable to destroy or damage the earthly life of any human being whatsoever.”

Her early claim, as de Beauvoir reported it, “that one thing alone mattered in the world today: the Revolution that would feed all the people on earth,” had deepened and ramified through her discovery of affliction. Affliction may have been grounded in our physicality, but it was much more than that. True affliction arose from “an event that grasps a life and uproots it, attacks it directly or indirectly in all its parts—social, psychological, physical.”

Thus, to counter affliction it was not enough to propose a politics that met humanity’s bodily needs (food, shelter, warmth, rest, exercise, breathable air, and potable water), though all this was essential and basic; there had, too, to be a politics that met those needs of the soul crushed, violated, and extinguished, in the deracinated degradation of the afflicted. For while it was the “impersonal” in us that was sacred, this sacredness found its sacramental expression in just that concern for the attachments of the “I” that soul-wearing affliction obliterated. If affliction involved the uprooting of life, then countering it politically meant respecting the human need for roots.

“A human being,” Weil wrote, “has roots by virtue of his real, active and natural participation in the life of a community which preserves in living shape certain particular treasures of the past and certain particular expectations for the future.” This meant that the political challenge we faced—insofar as we concerned ourselves with justice, and not merely the demands, challenges, and threats of force—was immense. This was because “in an epoch like ours”—ruled by the worship of money, driven by a false (because force-centred) conception of greatness, and committed to an assertive, individualistic, “rights”-based (mis)conception of justice in the context of the loss of any living sense of “the sacred”—we were all of us uprooted. This is something that Marx and Weber had noted, too, but without understanding it as an ethical, and so a spiritual, sickness.

Weil had, by this time, no faith in revolutionary politics as the path to a more just, more rooted, human world. Indeed, she had come to see the hope, even the pursuit, of revolution as “the opium of the people.” A politics that recognised and so opposed affliction had to be a moral politics, and ultimately therefore a supernatural politics, for it was “only what comes from heaven that can make a real impress on the earth.” What was required—as an ideal, if never, here in the material domain, as a fully achievable actuality—was a politics, so a shared political vision, that embodied and expressed “poignantly tender feelings” for the “beautiful, precious fragile and perishable object” that is a human being.

This, for Weil, was a politics of equality, not the assertive competitive equality of rights (“to place the notion of rights at the centre of social conflicts is to inhibit any possible impulse of charity on both sides”). It was the political equality of the universal, the eternal, mutual community of needs-based human obligations. Equality, she wrote, “consists in a recognition, at once public, general, effective and genuinely expressed in institutions and customs, that the same amount of respect and consideration is due to every human being because this respect is due to the human being as such and is not a matter of degree.”

Such a world, such a political society, was not, nor could it be, a world entirely without force, a world without those who give orders and those who obey. The very point of the ethical life, of justice, was to bring that life, that justice, to the recalcitrant material world of force and power; it was not to annihilate it in its own orgy of affliction producing, because affliction is blind, power.

What mattered was that the division between order and obedience, between intellectual and physical labour, was absolutely minimised, and that the division that remained rested in the real consent of those who, here, obeyed. A clear and instructive instance of such consent was, she felt, to be found in friendship, for friendship was alive and real and meaningful only when “each wished to preserve the faculty of free consent both in himself and in the other.”

Placed on the level of politics, such a demand, Weil insisted, could only ever be answered in and from the contingencies of real political history. However, as a general point, and one deeply relevant to the modern centralising state and its uprooting capitalist economics, what was called for, what was demanded, was just what she had first pointed to in Oppression and Liberty: the cooperative and systematic decentralisation of society in such a way that no human being was deprived of the “relative and mixed goods (home, country, traditions, culture, etc.) which warm and nourish the soul and without which, apart from sanctity, a human life is impossible.”

Such a cooperative and systematic decentralisation would open up the possibility of our becoming rooted in the world, so in place and in history, in a way that linked and balanced particularity and universality, the local and the global.

That possibility, if it were to be a real one, depended on our capacity to shape social force in ways that encouraged the conditions of mutual and attentive human respect, and so human self-respect. On one level, that simply meant organising our lives so as to facilitate the mutual and universal provision of our physical needs, but to be completed (and so to comprehend affliction), it had, too, to meet the needs of the soul. That, for Weil, meant balancing and harmonising what were, considered in themselves, antithetical needs. Indeed, it was just this antithetical character that allowed us to see the essential challenges for any politics of attention. Human beings, as beings free from the annihilating horrors of affliction, needed to organise themselves in such a way that they found an ordered world in which there was also individual freedom, a world in which there was true equality but also (for it was essential to any non-rudimentary social order) hierarchy, a world in which there was both the responsibility of command and the necessity for freely provided consensual obedience, a secure world, but one that allowed for a certain level of risk, a world shaped by an absolute and fundamental concern for truth, but also one that allowed for a real freedom of opinion, and a world that had a place for both private and collective property. These antithetical but also complementary needs of the soul constituted the principles and the challenges of political wisdom. Only through their having real effect might we have any hope for a “flowering of fraternity, joy, beauty and happiness.”

6. The Moral Ground

In one crucial sense, Weil had no time for traditional philosophical concerns for a “foundation” or a “ground” of morality and the ethical life. Any such efforts—like Kant’s attempt to ground the absolute obligation to treat people as ends-in-themselves in their “reverence for the [rational] Law,” or Aristotle’s attempt to ground our ethical concerns in the individual’s drive for self-development, or Hume’s attempt to derive ethical life from our “limited sympathies” in the context of more general prudential and utilitarian calculations—did not work and could not work. Any individual-centred account went astray from the start, for moral life was, at its heart, a matter of inter-human attention and care, while any account that, like Hume’s, viewed the essential inter-human aspect in terms of limited sympathies and local concerns was focally individualistic, and so provided no basis on which the “supernatural” universal mutuality of moral obligation might have arisen.

However, there was another sense in which Weil was concerned to find a ground for morality. For if she could not give an account of how the capacity for selflessly receptive attention to the suffering of others arose in and from the human condition, and so from human nature, then her moral vision would simply hang there, a fantasy interesting, if at all, only for what it revealed of its author’s personality.

Weil’s morality might invoke the supernaturalness of eternally binding human obligation, but it could only do this and avoid fantasy if that supernatural aspect had its origins in human nature, as indeed, Weil thought, it clearly did.

On what natural foundation then, on what natural primitive fact, did the human capacity, such as it was, to attend to the suffering, ultimately the affliction, of other people arise and (to the extent it did) develop? For Weil, the crucial point was that human beings—primitively, and all things being equal—reacted differently to “things” than they did to other human beings, and that this was the case because of a certain basic or fundamental “power” we exercised over each other. As she wrote in her early essay, “The Iliad or The Poem of Force”:

Anybody who is in our vicinity exercises a certain power over us by his very presence, and a power that belongs to him alone, that is, the power of halting, repressing, modifying each movement that our body sketches out. If we step aside for a passer-by on the road, it is not the same thing as stepping aside to avoid a billboard; alone in our rooms we get up, walk about, sit down again quite differently from what we do when we have a visitor. (5)

Consider the case of the passer-by; and assume a primitive situation—one where what we have is simply a passer-by, not (say) someone we already “read” as an enemy, means, or obstacle. When we see the other person, headed towards us and our path, we “hesitate” in a way we do not if we see, instead, a billboard in the way. There is, with the person, but not the billboard, a certain reciprocal power that modifies “each movement our body sketches out.” Here, in this primitive, “impersonal,” but reciprocity recognising reaction of human to human, is found “that interval of hesitation, wherein lies all our consideration for our brothers in humanity.”

For Weil, such impersonal recognition of the human is the primitive ground of that attention that fills the space “between the impulse and the act,” and in doing this makes the other real for us, one with us, and so one of us. It was, indeed, just this hesitation and the capacity for attention it expressed and opened up for further elaboration that embedded in our (inter)relationship that fundamental equality that meant consent was essential to justice between us. And—perhaps even more fundamental—it was an impersonal hesitation before the human that presupposed and acknowledged that which—through the de-creative powers of affliction—could be destroyed and annihilated by the impact of the “empire of force.” This primitive human perception/reaction, this attentive hesitation that recognised our reciprocity and (so) mutuality, expressed the eternal moral fact on which all of obligation arose and rested. For in our hesitation in the face of the passer-by, in their power to halt, repress, and modify each movement “our body sketches out,” lies an implicit recognition: the recognition of the “supernatural” fact that:

…at the bottom of the heart of every human being, from earliest infancy until the tomb, there is something that goes on indomitably expecting, in the teeth of all experience of crimes committed, suffered, and witnessed, that good and not evil will be done to him. It is this above all that is sacred in every human being. (SE 10)

It was here, “beyond space and time,” and as revealed in our primitive natural history, that Justice, that the Good, revealed itself in its eternal purity. It was here that Weil finally brought together her two most influential historical interlocutors, Kant and Plato. For the ground of our duty to treat others never merely as means, but always as ends in themselves, arose, not from “reverence for the (moral) law,” but from our primitive and reciprocal expectation that in the world, and so “in the teeth of all experience of crimes committed, suffered, and witnessed,” “good and not evil” will be done to us. This “indomitable expectation” is where morality enters the world of force and necessity. It is where the supernatural and the natural world make contact in the sacredness of the impersonal obligation to meet human needs.

7. References and Further Reading

a. Primary

  • Waiting on God. tr. Emma Craufurd, (Harper & Row, New York, 1973.)
  • Formative Writings: 1929–1941. eds. Dorothy Tuck McFarland and Wilhelmina Van Ness, (University of Massachusetts Press, 1987.)
  • Intimations of Christianity Among the Greeks. tr. Elisabeth Chas Geissbuhler, (Routledge Kegan Paul, London, 1957.)
  • Letter to a Priest. tr. Arthur Wills, (G. P. Putnam’s Sons, New York, 1954.)
  • The Need for Roots. tr. Arthur Wills, (Routledge Classics, London, 2002.)
  • Gravity and Grace. tr. Emma Crawford and Mario von der Ruhr, (Routledge Classics, London, 2002.)
  • The Notebooks of Simone Weil. tr. Arthur Wills, (Routledge, London, 2003.)
  • On Science, Necessity, & The Love of God. tr. Richard Rees, (Oxford University Press, 1968.)
  • Oppression and Liberty. tr. Arthur Wills and John Petrie (Routledge Classics, London, 2001.)
  • The Iliad, or the Poem of Force. tr. Mary McCarthy, Chicago Review 18:2 1965.
  • Simone Weil: First and Last Notebooks. tr. Richard Rees, (Oxford University Press, 1970.)
  • Simone Weil: Lectures on Philosophy. tr. Hugh Price, (Cambridge University Press, 1978.)
  • Simone Weil—Selected Essays: 1934–1943. tr. Richard Rees, (Oxford University Press, 1962.)
  • Simone Weil: Seventy Letters. tr. Richard Rees, (Oxford University Press, 1965.)
  • On the Abolition of All Political Parties. tr. Simon Leys, (Black Inc., Melbourne, 2013.)

b. Biographical

The deep connection between Weil’s thought and life has seen many authors explore her philosophy through her biography. Here are some of those.

  • Cabaud, Jacques, Simone Weil, (Channel Press, New York, 1964.)
  • Fiori, Gabriella, Simone Weil: An Intellectual Biography. tr. Joseph R. Berrigan, (University of Georgia Press, 1989.)
  • Gray, Francine Du Plessix, Simone Weil, (Viking Press, New York, 2001.)
  • McLellan, David, Utopian Pessimist: The Life and Thought of Simone Weil, (New York: Poseidon Press, 1990.)
  • Perrin, J.B. and Thibon, G., Simone Weil as We Knew Her. tr. Emma Craufurd, (Routledge & Kegan Paul, 1953.)
  • Pétrement, Simone (1976) Simone Weil: A Life. tr. Raymond Rosenthal, (Pantheon, New York, 1977.)
  • White, George A., ed. (1981). Simone Weil: Interpretations of a Life, University of Massachusetts Press (1981.)
  • Yourgrau, Palle, Simone Weil, Critical Lives Series, (Reaktion Press, London, 2011.)
  • Weil, Sylvie, At Home with André and Simone Weil. tr. Benjamin Ivry, (Northwestern University Press, 2010.)

c. Secondary

  • Allen, Diogenes, Three Outsiders: Pascal, Kierkegaard, Simone Weil, (Wipf and Stock, Eugene, 2006.)
  • Blanchot, Maurice, The Infinite Conversation. tr. Susan Hanson, (University of Minnesota Press, 1993.)
  • Bell, Richard H., Simone Weil, (Rowman & Littlefield, 1998.)
  • Chenavier, Robert, Simone Weil: Attention to the Real. tr. Bernard E. Doering. (University of Notre Dame Press, 2012.)
  • Dietz, Mary, Between the Human and the Divine: The Political Thought of Simone Weil, (Rowman & Littlefield, 1988.)
  • Doering, E. Jane, Simone Weil and the Specter of Self-Perpetuating Force. (University of Notre Dame Press, 2010.)
  • Doering, E. Jane, and Eric O. Springsted, eds. The Christian Platonism of Simone Weil, (University of Notre Dame Press, 2004.)
  • Finch, Henry Leroy, Weil and the Intellect of Grace, (Continuum International, New York, 1999.)
  • Irwin, Alexander, Saints of the Impossible: Bataille, Weil, and the Politics of the Sacred, (University of Minnesota Press, 2002.)
  • McCullough, Lissa, The Religious Philosophy of Simone Weil, (I. B. Tauris, London, 2014.)
  • Morgan, Vance G., Weaving the World: Simone Weil on Science, Mathematics, and Love, (University of Notre Dame Press, 2005.)
  • Moulakis, Athanasios, Simone Weil and the Politics of Self-Denial. tr. Ruth Hein, (University of Missouri Press, 1998.)
  • Plant, Stephen, Simone Weil: A Brief Introduction, (Orbis Books, 2007).
  • Radzins, Inese Astra, Thinking Nothing: Simone Weil’s Cosmology, (Vanderbilt University, 2005.)
  • Rhees, Rush, Discussions of Simone Weil, (SUNY Press, 2005.)
  • Rozelle-Stone, Rebecca A., and Stone, Lucien, Simone Weil and Theology, (Bloomsbury, New York, 2013.)
  • Springsted, Eric O. (2010) Simone Weil and the Suffering of Love. Wipf and Stock Publishers.
  • Veto, Miklos, The Religious Metaphysics of Simone Weil. tr. Joan Dargan, (State University of New York Press, 1994.)
  • von der Ruhr, Mario, Simone Weil: An Apprenticeship in Attention, (Continuum, London, 2006.)
  • Winch, Peter, Simone Weil: “The Just Balance,” (Cambridge University Press, 1989.)

Author Information

Tony Lynch
Email: alynch@une.edu.au
University of New England
Australia

Haskell Brooks Curry (1900-1982)

Haskell Brooks Curry was a mathematical logician who developed a distinct philosophy of mathematics. Most of his work was technical: he was the major developer of combinatory logic, which nowadays plays a role in theoretical computer science. This formalism was originally intended to be a basis for a system of symbolic logic in the usual sense, but the original system turned out to be inconsistent, and the core which was consistent later became a formalism that is a kind of prototype of the computer languages called functional, in which programs are allowed to apply to and change other programs. It is essentially equivalent to the λ-calculus of Alonzo Church. (See the article on λ-calculi in this encyclopedia.)

Curry’s work on combinatory logic led him to a notion of formal system which is different in some respects from the one which has since become standard. In addition, Curry became interested in proof theory, especially the work of Gerhard Gentzen. Curry wanted to use these ideas in his search for a consistent system of logic based on combinatory logic. Curry also did some work on computing in the early days, including work on the ENIAC (one of the first electronic computers) immediately after World War II. Finally, he also became known for a philosophy of mathematics that he called formalism, which he originally considered as defining mathematics as the science of formal systems (in his sense), but which he later extended to include formal methods in general. This idea of formalism is probably better thought of today as a form of structuralism.

Table of Contents

  1. Biography
  2. Combinatory Logic
    a. Beginning Period
    b. The Kleene-Rosser Paradox and its Aftermath
    c. Late Period (after World War II)
  3. Gentzen-style Proof Theory
  4. War Work and Computing
  5. Formalism: the Philosophy of Mathematics
  6. References and Further Reading
    a. Primary Sources
    b. Secondary Sources

1. Biography

Haskell Brooks Curry was born on September 12, 1900 at Millis, Massachusetts. His father was Samuel Silas Curry, president of the School of Expression of Boston, Massachusetts. The School of Expression was originally founded by Anna Baright in 1879 as the School of Elocution and Expression. It was renamed in 1885, after Anna Baright married Samuel Silas Curry. It became Curry College in 1943. His mother was Anna Baright, who was Dean of the School of Expression. He graduated from high school in 1916 and entered Harvard University with the intention of going into medicine. During his first year, he took a mathematics course at the suggestion of his advisor and did very well. In the Spring of 1917, the United States entered World War I, and Curry responded by enlisting in the army, becoming a member of the Student Army Training Corps on October 18, 1918. He felt he would never play a direct role in the war if he continued with his pre-medical course, so he changed his major to mathematics with the idea of going into the artillery. The war ended on November 11, 1918, and Curry left the army on December 9, 1918, but he kept on in mathematics, receiving his A. B. degree in 1920.

For the next two years he studied electrical engineering at MIT in a program that involved working half-time at the General Electric Company. Because he was usually interested in why an answer was correct when the engineers seemed interested only in the fact that it was correct, he decided that he would be better off pursuing a degree in pure science, and in 1922 he switched to physics. He returned to Harvard, where for the year 1922–23 he was a half-time research assistant to P. W. Bridgman, who later won the Nobel prize in physics. In 1924 he received his A.M. in physics (from Harvard). But by this time his interests had shifted still further, and he now switched to mathematics. (During this period, both of his parents died, his father dying in 1921 and his mother in 1924.)

He continued to study mathematics at Harvard until 1927, where he was a half-time instructor during the first semester of 1926-27 but otherwise studied full-time. He was also involved in the business affairs of his family, the School of Expression.

During this period, Curry had become interested in logic. Originally, all of his logic was reading on the side, and at one point he was supposed to be working on a dissertation on a topic in differential equations assigned to him by George D. Birkhoff. Furthermore, he was getting advice from various faculty members at Harvard and elsewhere to stay away from logic. This advice was especially strong from Norbert Wiener, who was at MIT and who was a member of the same birdwatching club as Curry. But Curry had become too interested in logic to stop thinking about it. He was especially interested in the first chapter of Principia Mathematica [Russell and Whitehead 1910-1913], which he started reading in 1922 when he was 21 years old, and where a system of propositional logic is defined by means of axioms and two primitive rules. The first one is detachment, which says that from not-p or q and from p one may deduce q (this is equivalent to modus ponens, which says that from p \supset q and p one may deduce q). The second one is substitution, which says that given any formula, any formula obtained by substituting another formula for a variable can be deduced; for example, in the formula p \supset p, one can substitute \neg q \vee r for p to get \neg q \vee r \supset \neg q \vee r. Curry noticed when he first saw this that the rule of substitution is much more complicated than detachment in the sense that today we would find it more complicated to implement in a computer language. In 1926-27, as a result of trying to analyze substitution down to its simplest elements, Curry had the idea for using operators which he called combinators, the term we still use today. He used these operators to analyze this rule of substitution, and he concluded that this idea might lead to a dissertation. When he took this idea to several professors, he got a different reaction than he had previously had about staying away from logic. This was especially true of Norbert Wiener at MIT, who said that his opinion had been that logic was a subject to be avoided “unless you had something to say,” and since Curry clearly had something to say, “strength to your right arm!”

However, there was no faculty member at Harvard who could supervise a dissertation on this topic. So Curry decided that it would be useful to teach for a year, and, after getting a recommendation for the position from George D. Birkhoff, assumed an instructorship at Princeton for the year 1927-28. During a library search there he found the paper by Moses Schönfinkel, [Schönfinkel 1924], a report of a talk given at Göttingen in 1920, which had clearly anticipated his ideas. Curry was shocked at this anticipation because he had thought his ideas were completely original, and he ran to the office of Oswald Veblen, who, although primarily a geometer, was interested in the foundations of mathematics and who was also the PhD supervisor of Alonzo Church, to tell him about the anticipation. Veblen calmed Curry down by saying, “Good, I am always glad when somebody has one of my ideas, for it shows that I am on the right track.” To find out more about Schönfinkel, Veblen then took Curry to see the Russian topologist Pavel Alexandroff, who was visiting Princeton that year. Alexandroff reported that Schönfinkel was in a mental hospital and was unlikely to resume his mathematical work, but that at Göttingen were several mathematicians, including Paul Bernays, who were probably better placed to discuss these topics. It was thus decided that Curry should go to Germany.

As part of an application for financial support for that trip, Curry wrote his first published paper, [Curry 1929]. Before leaving for Germany, Curry married, on July 3, 1928, Mary Virginia Wheatley of Hurlock, Maryland. (Virginia had been a student at the School of Expression, where they met.) After the wedding, the Currys left for Germany, where they spent the year 1928-29 at Göttingen. During that year, Curry first met the logician Alonzo Church, who was there for half the year.

That year at Göttingen was enough for Curry to complete his dissertation. His referee was David Hilbert, although he actually did most of his work with Paul Bernays, and he was examined on July 24, 1929. At this examination, Hilbert asked Curry a question on another topic (called automorphic functions), which Hilbert assumed that Curry would not know. As it happened, Curry had taken a course on that very subject at Harvard, and Curry was able to give a good answer. Hilbert responded by asking in great surprise, “Wo haben sie das gelernt?” (“Where did you learn that?”) The dissertation was published (in German) as [Curry 1930].

Curry now needed a job, and he took up a position as an Assistant Professor at the Pennsylvania State College (Penn State – Penn State became the Pennsylvania State University in 1953). Eventually, most people who knew Curry came to associate him with Penn State, but when he first went there he did not plan to stay long. He had been at Harvard, Princeton, and Göttingen, and at Penn State he felt cut off from most of his former academic community. Furthermore, in those days, Penn State did not support research. (Later, thanks partly to Curry’s influence, Penn State changed its policy, and it is now a major research institution.) But his arrival there coincided with the beginning of the Great Depression, and the demand for logicians in the academic world was not very high. So he remained and settled down at Penn State, staying there, with the exception of several leaves of absence, until his retirement in 1966. He progressed normally through the academic ranks, becoming an Associate Professor in 1933 and a full Professor in 1941.

Everybody who knew the Currys was aware of how friendly and helpful they always were. Curry always did more for colleagues and students than be a source of important ideas (although, of course, his ideas have been of tremendous importance). He was always willing to listen to anybody who wanted to talk to him, to discuss their ideas, and to give whatever encouragement he could. His office door was always open. Also well known wherever the Currys lived was the hospitality they both showed. There were always many parties and other, less formal, gatherings. Curry also had a playful sense of humor.

The first of his leaves of absence was a year at the University of Chicago in 1931-32 as a National Research Council Fellow. (The original award was supposed to extend into the following year, but the second year was cancelled for Curry because he had a job to go back to and there were other National Research Council Fellows who did not. It was, after all, the depths of the Great Depression.) In 1938-39, Curry was in residence at the Institute for Advanced Study in Princeton.

Otherwise, Curry spent the 1930s at Penn State teaching and carrying on his research. During this period he was on the reviewing staff of the Zentralblatt für Mathematik und ihre Grenzgebiete (1931-1939). In 1936, he became a founding member of the Association for Symbolic Logic; he was Vice President in 1936-37 and President in 1938-40 as well as being a member of the Council as ex-president during 1942-46.

During this period, the Currys also began their family: Anne Wright Curry (later Mrs. Richard S. Piper) was born on July 27, 1930, and Robert Wheatley Curry followed on July 6, 1934.

By the end of the 1930s, Curry was established as one of the most important mathematical logicians in the United States and, in fact, in the entire world. As such, he was asked to present his views on the nature of mathematics to the International Congress for the Unity of Science held at Cambridge, Massachusetts at the beginning of September 1939. The result was a long manuscript of which he presented a shorter version to the Congress, [Curry 1939]. A series of papers on the philosophy of mathematics began with this paper and continued for the rest of his life.

In the following year, 1940, Curry became a member of the Board of Trustees of Curry College, formerly the School of Expression, the institution of which his father had been president. He remained a member until 1951. Later, on June 5, 1966, the college presented him with the honorary degree of Doctor of Science in Oratory.

When the United States entered World War II, Curry decided to put logic aside for the duration of the war. From 1940 until 1942 he had been a member of the National Committee on War Preparedness of the American Mathematical Society and the Mathematical Association of America. On May 25, 1942, he left Penn State and went to the Frankford Arsenal, where he worked as an applied mathematician until January 1944; then he went to the Applied Physics Laboratory at Johns Hopkins University, where he remained until March, 1945. Next he went to the Ballistic Research Laboratories at the Aberdeen Proving Ground, where he stayed until September, 1946. During his last three months there, he was Chief of the Theory Section of the Computing Laboratory and for one month he was Acting Chief of the Computing Laboratory; it was during this period that he became involved with the ENIAC computer. As a result of this experience he was a consultant in the field of computing methods to the United States Naval Ordnance Laboratory from June 1, 1948 until June 30, 1949.

In September, 1946, Curry returned to Penn State. He wanted to pursue his work on electronic computers, and so he tried to interest the university in acquiring some computing equipment. He was unsuccessful in this. He persisted until a colleague pointed out to him that if he did succeed, he would probably be made head of the program without any increase in salary. He then decided that this colleague was right and gave up the attempt. This effectively limited him from pursuing computing theory.

He was, however, getting back to logic. In Amsterdam in the summer of 1948, during the Tenth International Congress of Philosophy, it was proposed to him that he write a little book of under 100 pages on the subject of combinatory logic for the new North-Holland series in logic. He felt that there was too much unpublished research on the subject to write such a short book, and so he sent them instead his philosophical manuscript from 1939 with a few minor revisions. This appeared as [Curry 1951]. But this idea did suggest to him the project that eventually led to his two volumes with the title Combinatory Logic, [Curry and Feys 1958] and [Curry et al. 1972]. Feeling that he needed a collaborator, especially one who was better than he was at exposition, he decided to work with Robert Feys, who had published some papers on combinatory logic. Curry thus obtained a Fulbright grant and spent the year 1950-51 at Louvain in Belgium. After his return to Penn State, he and Feys continued their work, and the manuscript of [Curry and Feys 1958] was completed in 1956. The book appeared in 1958, published by North-Holland.

Meanwhile, money finally became available at Penn State for graduate students. Edward J. Cogan first approached Curry before he left for Louvain, and worked with Curry after he returned in 1951, finishing his dissertation in 1955. Kenneth L. Loewen also studied with him during this period, but left to take an academic position elsewhere in 1954 and did not finish his dissertation until 1962.

After the completion of [Curry and Feys 1958] Curry turned his attention to Gentzen-style proof theory. He had done some previous work on this, including a series of lectures delivered at Notre Dame University in Indiana in April, 1948 (which resulted in his book [Curry 1950]), and he felt that it formalized the kind of reasoning used in the development of the part of combinatory logic as a system of logic in the usual sense, and so he felt that it should be settled before he began work on [Curry et al. 1972]. He thus began work on what became his book [Curry 1963]. This work was made easier when, in 1959, he became Evan Pugh Research Professor and was thus relieved of undergraduate teaching duties. The manuscript of [Curry 1963] was completed in 1961.

By this time, there were two more graduate students, Bruce Lercher and Luis E. Sanchis, both of whom completed their dissertations in 1963.

From February to September, 1962, the Currys took a trip around the world, visiting a number of universities where Curry gave lectures.

In 1964, Curry met two new future collaborators. J. Roger Hindley arrived at Penn State for a lectureship which served as something of a postdoctoral position after finishing his dissertation at Newcastle-upon-Tyne, and Jonathan P. Seldin arrived as a beginning graduate student. Curry was just beginning work on [Curry et al. 1972]. Unfortunately, Feys had died in 1961, and Curry, left to work alone, soon realized that he needed collaborators. In 1965, he invited Hindley to join him on the project.

In 1966, Curry retired from Penn State after being there for 37 years. He then went to Amsterdam, where for the next four years he was Professor of Logic, History of Logic, and Philosophy of Science, and also Director of the Instituut voor Grondslagenonderzoek en Philosophie der Exacte Wetenschappen, both at the University of Amsterdam. Seldin went to Amsterdam on a Graduate Fellowship from the United States National Science Foundation, and completed his dissertation in 1968, after which he joined Curry and Hindley as a co-author of the book they were then writing. Curry had one more graduate student in Amsterdam, Martin W. Bunder, who finished his dissertation in 1969.

The manuscript of [Curry et al. 1972] was completed in May, 1970, just before Curry retired from the University of Amsterdam. He returned to State College, Pennsylvania (the town in which Penn State is located), where he continued his mathematical work, writing reviews (especially for Mathematical Reviews) and occasional papers. John A. Lever wrote a master’s thesis with him there in 1977 after obtaining special permission from the university authorities to work under a retired professor. In 1971-72, Curry accepted a visiting position at the University of Pittsburgh. Otherwise, he and Virginia remained at State College, except for some occasional trips, until his death on September 1, 1982. Curry left his papers to the library at Penn State.

Curry’s hobby throughout his life was bird watching, and by the end of his life, Curry had a reputation as an amateur ornithologist.

2. Combinatory Logic

a. Beginning Period

Curry invented combinatory logic independently by analyzing the operation of the substitution of a well-formed formula for a propositional variable in the system of propositional logic of the first chapter of [Russell and Whitehead 1910-1913]. He intended combinatory logic to be a foundation for mathematical logic and perhaps also for all of mathematics. Much of the subject is extremely technical. This will be as non-technical an introduction as it is possible to write.

The basic idea here is that of a function, which is a mathematical operation which does something to an input. Thus, for example, there is the numerical function which squares its argument (i.e., multiplies it by itself). Mathematicians usually write that if f is the squaring function, then for each possible argument (input) x, f(x) = x^{2}. Then, if this function is applied to the number 3, we get f(3) = 3^{2} = 9.

In combinatory logic, the application of a function to an argument, such as f(3), is written (f3) or f3. Also, the need for functions of more than one variable is avoided by allowing the value of a function to be another function. For example, suppose, in traditional notation, f(x,y) = x - y. Then let g(x) = h_x, where h_x is the function of one argument defined by h_x(y) = x - y. Then f(3,y) = 3 - y = h_3(y). In combinatory notation, (gx)y = x - y and (g3)y = 3 - y. In this notation, we use association to the left for application, so that gxy = (gx)y.

This method of using only functions of one argument has come to be called currying, and the function g of the previous paragraph is often called (curry f). Curry himself learned of this use of his name in his last years, and he protested because he had gotten the idea from Schönfinkel, but this use of Curry’s name has stuck.
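The passage from f to g can be illustrated with a minimal Python sketch. The helper name curry and the sample arguments are choices made for this illustration only; they are not Curry's or Schönfinkel's notation.

```python
def f(x, y):
    """Two-argument subtraction in traditional notation: f(x, y) = x - y."""
    return x - y


def curry(fn):
    """Turn a two-argument function into a one-argument function
    whose value is itself a one-argument function."""
    def g(x):
        def h_x(y):
            return fn(x, y)
        return h_x
    return g


g = curry(f)        # g plays the role of (curry f)
h_3 = g(3)          # the function h_3 with h_3(y) = 3 - y

print(g(5)(2))      # (g5)2 = 5 - 2, prints 3
print(h_3(10))      # h_3(10) = 3 - 10, prints -7
```

Note that g(5)(2) mirrors the left-associated combinatory notation g52 = (g5)2.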

Some basic combinators are:

  1. The identity operator \mathsf{I}, with the property that \mathsf{I} x = x.
  2. The constancy operator \mathsf{K} with the property that \mathsf{K} xy = x. Thus, \mathsf{K} x is a constant function whose value for any argument is x.
  3. The compositor \mathsf{B}, with the property that \mathsf{B} xyz = x(yz). This says that to apply \mathsf{B} xy to z, first apply y to z and then apply x to the result.
  4. The diagonalizer \mathsf{W} with the property that \mathsf{W} xy = xyy.
  5. The distributor \mathsf{S} with the property that \mathsf{S} xyz = xz(yz).

Note that \mathsf{I} can be defined in terms of the other operators, since \mathsf{W} \mathsf{K} x = \mathsf{K} xx = x, so \mathsf{I} = \mathsf{W} \mathsf{K}. Also, since \mathsf{S} \mathsf{K} \mathsf{K} x = \mathsf{K} x(\mathsf{K} x) = x, \mathsf{I} can be defined as \mathsf{S} \mathsf{K} \mathsf{K}.
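As a rough illustration, the five operators can be modelled as curried Python functions. This is only a sketch: Python can check the defining equations on sample arguments, but it cannot prove that two combinators are equal. The helper functions succ, double, and add are assumptions introduced for the example.

```python
# Each combinator takes one argument at a time, so K(x)(y) stands for Kxy.
I = lambda x: x
K = lambda x: lambda y: x
B = lambda x: lambda y: lambda z: x(y(z))
W = lambda x: lambda y: x(y)(y)
S = lambda x: lambda y: lambda z: x(z)(y(z))

# Hypothetical sample functions used only to exercise the definitions.
succ = lambda n: n + 1
double = lambda n: 2 * n
add = lambda m: lambda n: m + n

print(B(succ)(double)(5))   # succ(double(5)), prints 11
print(W(add)(4))            # add(4)(4), prints 8

# WK and SKK both behave like I on the sample arguments tried here.
for v in (0, "a", succ):
    assert W(K)(v) is v
    assert S(K)(K)(v) is v
```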

Now suppose we want to say that an operation, say addition, is commutative (i.e. the order of adding does not matter). The traditional way of writing this in mathematics is x + y = y + x. But this is not a property of x and y; it is a property of +. To say this in the language of combinatory logic, we would write +xy = +yx. Now suppose we have an operator \mathsf{C} (for “commutator”) with the property that (\mathsf{C} x)yz = xzy. Then +yx = (\mathsf{C} +)xy, and we can say that + is commutative by writing (\mathsf{C} +) = +. This operator \mathsf{C} is called a combinator.
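The role of \mathsf{C} can be sketched in Python in the same curried style. The functions add and sub are assumptions for the example, and the comparison is made only on a finite sample of arguments, so it illustrates the statement (\mathsf{C} +) = + rather than proving it.

```python
# C swaps the two arguments of a curried function: (Cx)yz = xzy.
C = lambda x: lambda y: lambda z: x(z)(y)

add = lambda m: lambda n: m + n   # curried +
sub = lambda m: lambda n: m - n   # curried -, for contrast


def agrees_with_flip(op, samples):
    """Check (C op) y z == op y z on every pair of sample arguments."""
    return all(C(op)(y)(z) == op(y)(z) for y in samples for z in samples)


samples = range(-3, 4)
print(agrees_with_flip(add, samples))   # addition is commutative, prints True
print(agrees_with_flip(sub, samples))   # subtraction is not, prints False
```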

The defining rules for these combinators have been written above with the equality symbol, which is symmetric. But it is often useful to read these equations only from left to right. Then these equations would be called contractions, so that \mathsf{I} x contracts to x, \mathsf{C} xyz contracts to xzy, \mathsf{K} xy contracts to x, \mathsf{B} xyz contracts to x(yz), \mathsf{W} xy contracts to xyy, and \mathsf{S} xyz contracts to xz(yz). Terms are reduced to other terms by performing sequences of 0 or more contractions on subterms of the original term. For example, the reduction of \mathsf{S} \mathsf{K} \mathsf{K} x to x is as follows:

\mathsf{S} \mathsf{K} \mathsf{K} x \rhd \mathsf{K} x (\mathsf{K} x) \rhd x.

(Here I am using the symbol ‘\rhd’ to indicate a reduction.) Note that there are some terms which cannot be reduced. These terms are said to be in normal form. On the other hand, some terms can lead to infinite reductions, for example

\mathsf{W} \mathsf{W} \mathsf{W} \rhd \mathsf{W} \mathsf{W} \mathsf{W} \rhd \ldots.
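Contraction and reduction can be sketched as a small term rewriter. The representation below is an assumption made for this illustration: an atom is a string, and the application of X to Y is the pair (X, Y). The strategy of always contracting the leftmost available redex and the step limit max_steps are also choices of the sketch, not part of the definition of reduction.

```python
# Contraction rules: name -> (number of arguments, builder of the result).
RULES = {
    "I": (1, lambda x: x),
    "K": (2, lambda x, y: x),
    "B": (3, lambda x, y, z: (x, (y, z))),
    "C": (3, lambda x, y, z: ((x, z), y)),
    "W": (2, lambda x, y: ((x, y), y)),
    "S": (3, lambda x, y, z: ((x, z), (y, z))),
}


def spine(term):
    """Split ((f a) b) into its head f and argument list [a, b]."""
    args = []
    while isinstance(term, tuple):
        term, arg = term
        args.append(arg)
    return term, args[::-1]


def build(head, args):
    """Inverse of spine: apply head to the arguments, associating to the left."""
    for a in args:
        head = (head, a)
    return head


def contract_once(term):
    """Perform one leftmost contraction; return None if term is in normal form."""
    head, args = spine(term)
    if head in RULES:
        arity, rule = RULES[head]
        if len(args) >= arity:
            return build(rule(*args[:arity]), args[arity:])
    for i, a in enumerate(args):          # otherwise look inside the arguments
        reduced = contract_once(a)
        if reduced is not None:
            return build(head, args[:i] + [reduced] + args[i + 1:])
    return None


def reduce_term(term, max_steps=100):
    """Return (final term, contractions performed, reached normal form?)."""
    for step in range(max_steps):
        nxt = contract_once(term)
        if nxt is None:
            return term, step, True
        term = nxt
    return term, max_steps, False


SKKx = ((("S", "K"), "K"), "x")
WWW = (("W", "W"), "W")
print(reduce_term(SKKx))               # ('x', 2, True)
print(reduce_term(WWW, max_steps=10))  # still WWW after 10 steps, no normal form
```

The first call reproduces the two contractions of \mathsf{S} \mathsf{K} \mathsf{K} x shown above; the second stops only because of the step limit, since \mathsf{W} \mathsf{W} \mathsf{W} contracts to itself.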

Curry decided to found mathematical logic on a system of combinators whose primitive combinators were \mathsf{B}, \mathsf{C}, \mathsf{K}, and \mathsf{W}. (He did not yet understand the role of \mathsf{S}, which he got from Schönfinkel.) The part of combinatory logic that deals with the basic properties of the combinatory terms without reference to logical connectives and quantifiers is now called pure combinatory logic. He was going to add logical connectives and quantifiers until he had developed a complete system of logic; this part of the subject he called illative combinatory logic. This word “illative” is a word Curry coined himself, based on the Latin word illatum, the past participle of infero, which means “to conclude”.

He proved several important results in this context. First of all he proved that if X is any combination of combinators and the variables x_{1}, x_{2}, \ldots x_{n}, there is a term F in which the variables x_{1}, x_{2}, \ldots x_{n} do not appear such that Fx_{1}x_{2}\ldots x_{n} = X. Curry used the notation [x_{1}, x_{2}, \ldots , x_{n}]X for this F. For example, since \mathsf{S} \mathsf{I} \mathsf{I} x \rhd \mathsf{I} x (\mathsf{I} x) \rhd xx, we can take \mathsf{S} \mathsf{I} \mathsf{I} to be [x]xx. He also gave axioms for the system so that this F was uniquely determined by X and the variables in question. (The existence of such an abstract for every term X and all variables x_{1}, x_{2}, \ldots , x_{n} is called combinatory completeness.) Another of the things he proved early on (in his dissertation) is that the basic system of combinators, without any axioms for any logical connectives or quantifiers, is consistent.
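One way to see why such an F always exists is the following sketch of an abstraction algorithm. It uses the clauses now commonly stated in terms of \mathsf{S}, \mathsf{K}, and \mathsf{I}; this is an assumption of the illustration and not necessarily the construction Curry gave with his own primitive combinators, and it does not model the axioms that make F unique. Terms are represented as in a simple tree: an atom is a string and an application is a pair.

```python
def occurs(x, term):
    """Does the variable x occur anywhere in term?"""
    if isinstance(term, tuple):
        return occurs(x, term[0]) or occurs(x, term[1])
    return term == x


def abstract(x, term):
    """Return a term F, not containing x, such that F x reduces to term."""
    if term == x:
        return "I"                       # [x]x = I
    if not occurs(x, term):
        return ("K", term)               # [x]M = K M when x is not in M
    fun, arg = term                      # term is an application (M N)
    return (("S", abstract(x, fun)), abstract(x, arg))   # S([x]M)([x]N)


def abstract_many(variables, term):
    """[x1, ..., xn]X, computed as [x1]([x2](...([xn]X)...))."""
    for x in reversed(variables):
        term = abstract(x, term)
    return term


print(abstract("x", ("x", "x")))         # (('S', 'I'), 'I'), that is, S I I
print(abstract_many(["x", "y"], "x"))    # (('S', ('K', 'K')), 'I')
```

The first result agrees with the example in the text, where \mathsf{S} \mathsf{I} \mathsf{I} is taken to be [x]xx.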

Using the notation of combinators, Curry wrote what is normally written (\forall x)A as \Pi X, where Xx = A. This operator \Pi was present in his dissertation, but none of its properties were developed there. Instead, Curry started writing a series of papers expanding combinatory logic to include not only this universal quantifier \Pi, but also \mathsf{P} (for implication, so that \mathsf{P} XY = X \supset Y, or if X then Y) and equality \mathsf{Q}, so that \mathsf{Q} xy means x = y. In 1934, Curry published [Curry 1934a] giving properties of \mathsf{P} and \mathsf{Q}.

b. The Kleene-Rosser Paradox and its Aftermath

In 1932, Curry learned of a paper by Alonzo Church, [Church 1932]. Church’s system was based on \lambda-abstraction, which forms terms from variables by application and abstraction: if x is a variable and M is a term, then (\lambda x \;.\; M) is a term. (The outermost parentheses may be omitted if no confusion results.) For example, (\lambda x \;.\; x^{2}) is the squaring function, and (\lambda x \;.\; x^{2})3 = 3^{2} = 9. Here, (\lambda x_{1} x_{2} \ldots x_{n} \;.\; M), which is an abbreviation for (\lambda x_{1} \;.\; (\lambda x_{2} \;.\; \ldots (\lambda x_{n} \;.\; M) \ldots )), plays the role of Curry’s [x_{1}, x_{2}, \ldots , x_{n}]X. (For a complete introduction to both \lambda-calculus and combinatory logic, see [Hindley and Seldin 2008]. See also the article on \lambda-calculi in this Encyclopedia.) Also, the variable x in \lambda x \;.\; M is called bound; variables not within the scope of a \lambda are called free.

Reduction for Church’s system is defined by a rule that Curry called (\beta): (\lambda x \;.\; M)N contracts to [N/x]M, which is the result of substituting N for x in M, where other bound variables are changed to avoid capture. In ordinary predicate logic, this sort of change is made by changing (\forall x)(x < y) to (\forall z)(z < y) if a term in which x occurs free is substituted for y.
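The change of bound variables in rule (\beta) can be made concrete with a small sketch in Python. It is only an illustration under stated assumptions: the tagged-tuple encoding of \lambda-terms, the function names free_vars, fresh, subst, and beta, and the fresh names v0, v1, and so on are all hypothetical and do not come from Church or Curry.

```python
import itertools

# Lambda-terms: ("var", x), ("lam", x, body), ("app", function, argument).

def free_vars(t):
    """The set of variables occurring free in t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def fresh(avoid):
    """Return a variable name v0, v1, ... that is not in `avoid`."""
    for i in itertools.count():
        name = f"v{i}"
        if name not in avoid:
            return name

def subst(n, x, m):
    """[n/x]m: substitute n for the free occurrences of x in m."""
    if m[0] == "var":
        return n if m[1] == x else m
    if m[0] == "app":
        return ("app", subst(n, x, m[1]), subst(n, x, m[2]))
    y, body = m[1], m[2]
    if y == x:
        return m                              # x is bound here: nothing to replace
    if y in free_vars(n) and x in free_vars(body):
        # Rename the bound variable so that it cannot capture a free y in n.
        z = fresh(free_vars(n) | free_vars(body) | {x})
        body = subst(("var", z), y, body)
        y = z
    return ("lam", y, subst(n, x, body))

def beta(t):
    """Contract a top-level redex (lam x . M) N to [N/x]M; otherwise return t."""
    if t[0] == "app" and t[1][0] == "lam":
        _, x, body = t[1]
        return subst(t[2], x, body)
    return t

# (lam y . lam x . x y) x: the free x must not be captured by the inner lam x.
redex = ("app",
         ("lam", "y", ("lam", "x", ("app", ("var", "x"), ("var", "y")))),
         ("var", "x"))
print(beta(redex))   # ('lam', 'v0', ('app', ('var', 'v0'), ('var', 'x')))
```

Without the renaming step, the result would be \lambda x \;.\; xx, in which the substituted x has been captured; this is the same mistake as turning (\forall x)(x < y) into (\forall x)(x < x).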

Note that reduction in Church’s system differs from reduction in combinatory logic in that if M reduces to N, then \lambda x \;.\; M reduces to \lambda x \;.\; N, but in combinatory logic the fact that X reduces to Y does not automatically imply that [x]X reduces to [x]Y, since subterms of X often do not really occur in [x]X.

In 1934, Curry received a letter from Rosser informing him that Kleene and Rosser had proved inconsistent the system of [Church 1932] and the system of [Curry 1934a]. They did this by deriving Richard’s Paradox in both systems. (See the article on Richard’s Paradox in this Encyclopedia.)

Church and his students, Kleene and Rosser, then gave up on the idea of building a system of mathematical logic adequate for all of mathematics by basing the system on \lambda-terms. Instead, they took that part of Church’s system involving only \lambda-terms and treated it separately as the \lambda-calculus. (See the article on \lambda-calculi in this Encyclopedia.) But Curry had a different reaction. He had always considered the possibility that some systems he would propose might be inconsistent, and so he reacted by beginning a careful analysis of the paradox with the idea of finding a way to define a consistent system.

This analysis lasted for several years, and by the time he took a leave of absence from Penn State to do applied mathematics for the U.S. government during World War II, he had developed a plan for research to look for consistent systems. He had already published [Curry 1941], and he had found a much simpler paradox (now known as Curry’s Paradox; see [Curry 1942b]). The plan he had developed was to look at three different kinds of systems, which differed in the logical connectives and quantifiers that were taken as primitive. The kinds of systems will be discussed here in the order Curry gave them in [Curry 1942a].

  1. Systems based on the theory of functionality. This was Curry’s idea, dating back to 1930, that led to type assignment. He wrote \mathsf{F} \alpha \beta for the predicate of functions which take arguments in \alpha with values in \beta, and he intended \mathsf{F} \alpha \beta X to mean (\forall x)(\alpha x \supset \beta (Xx)). Nowadays, the category (or predicate) \mathsf{F} \alpha \beta is considered a type rather than a predicate, and is usually written \alpha \rightarrow \beta.
  2. Systems based on the theory of restricted generality. Curry had noted that most universal quantification is not absolute, but is over some restricted domain. (This seems obvious nowadays, but in the 1930s it ran counter to the generalising tendency of Frege and Russell.) He defined an operator \Xi to stand for this restricted quantification, so that \Xi X Y would stand for (\forall x : X)(Yx), or (\forall x)(Xx \supset Yx) (where here x does not occur free in X or Y).
  3. Systems based on the theory of universal generality. These were systems based on \Pi and \mathsf{P}, where \Pi X meant (\forall x)(Xx) (where x does not occur free in X) and \mathsf{P} XY means X \supset Y.

In 1942, Curry assumed that these kinds of systems increased in strength in the order given above. The paper [Curry 1942a] was really an abstract of future research rather than a report on completed work.

In the late spring of 1942, Curry finally came to understand the combinator \mathsf{S}. Rosser had published a paper on combinatory logic (based on different basic combinators from those Curry used), and he had shown how to define [x]X by induction on the structure of X. When Curry read this paper and translated the results into his own formalism, he realized why Schönfinkel had defined all combinators in terms of \mathsf{K} and \mathsf{S}, and he started to do the same. The use of \mathsf{S} greatly increased the lengths of definitions of [x]X compared with Curry’s original definition, but greatly simplified the algorithm for building them. With computer implementation has come a reversal of values: an algorithm’s speed of action is now valued more than its simplicity or “beauty”.
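A definition of [x]X by induction on the structure of X can be sketched in a few lines of Python. The version below is the common three-clause textbook algorithm using \mathsf{S}, \mathsf{K}, and \mathsf{I}; it is offered as an assumption-laden illustration and is not claimed to be Rosser’s or Curry’s exact formulation. Terms are encoded as nested pairs, and the names occurs, abstract, abstract_many, and size are hypothetical.

```python
# Terms: an atom is a string; an application is a pair (function, argument).

def occurs(x, term):
    """Does the variable x occur anywhere in term?"""
    if isinstance(term, tuple):
        return occurs(x, term[0]) or occurs(x, term[1])
    return term == x

def abstract(x, term):
    """Compute [x]term by induction on the structure of term."""
    if term == x:
        return "I"                                    # [x]x = I
    if not occurs(x, term):
        return ("K", term)                            # [x]X = K X when x is not in X
    f, a = term
    return (("S", abstract(x, f)), abstract(x, a))    # [x](UV) = S([x]U)([x]V)

def abstract_many(variables, term):
    """[x1, ..., xn]X, computed as [x1]([x2](...[xn]X...))."""
    for x in reversed(variables):
        term = abstract(x, term)
    return term

def size(term):
    """Number of atoms in a term, as a rough measure of its length."""
    if isinstance(term, tuple):
        return size(term[0]) + size(term[1])
    return 1

print(abstract("x", ("x", "x")))   # (('S', 'I'), 'I'), that is, S I I
```

The algorithm has only three clauses, which is the simplicity the text refers to. Comparing size(abstract_many(vs, X)) with size(X) for a term X in several variables gives a sense of how quickly the resulting terms grow.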

c. Late Period (after World War II)

After World War II, when Curry returned to Penn State, he slowly got back into logic. (For details, see the section of this article on Curry’s work during World War II.) He attended the Tenth International Congress of Philosophy in the summer of 1948, and as a result of a proposal made to him there, he decided to write a long work on combinatory logic, which he intended to include everything known on the subject. Feeling he needed a collaborator, he approached Robert Feys at Louvain in Belgium. Curry used a Fulbright, which he was awarded for the year 1950-51, to start this work, and Curry and Feys continued to work on it after Curry returned to Penn State in 1951. Curry wound up working on this book and a second volume for most of the rest of his life.

The earliest work on this book was on the basic exposition. Curry and Feys completely revised the foundations of combinatory logic, and spent a lot of time explaining Curry’s approach to formal reasoning and formal systems. They then introduced Church’s \lambda-calculus, and gave a new proof and analysis of the Church-Rosser Theorem, which proves pure \lambda-calculus consistent. The book then took up combinatory logic itself, first pure combinatory logic and then illative combinatory logic. The book finishes with two chapters on the theory of functionality.

However, Curry soon began new research to be included. At first, this included work expanding the theory of functionality. There was always more than one such theory, and different theories depended on which terms could be what we would now call types, but which Curry called F-obs. There is the basic theory of functionality, in which types are formed from atomic types by the operation that forms \mathsf{F} \alpha \beta from \alpha and \beta. (This is equivalent to forming the type \alpha \rightarrow \beta from \alpha and \beta.) This system is easily proved consistent.

Then there is the full free theory of functionality, in which any combinatory term can be a type. Curry thought that this system was consistent, and in 1954 he tried to prove that consistency. He spent over four months at this attempt by trying to prove that if, from a set of typing assumptions \xi_{1} X_{1}, \xi_{2} X_{2}, \ldots , \xi _{n} X_{n} (where X_{1}, X_{2}, \ldots , X_{n} may be any combinatory terms), one can prove \xi X, then the deduction must take a certain specific form. After almost five months, he realized that if \xi X is the conclusion of any deduction in this special form, then the term X is irreducible in some sense. But the sense involved was not the sense of reduction in combinatory logic, but rather the sense of \lambda-calculus. The difference is that in \lambda-calculus, if M \rhd N then \lambda x \;.\; M \rhd \lambda x \;.\; N, which is what one would expect. But in combinatory logic, the fact that X \rhd Y does not automatically imply that [x]X \rhd [x]Y, for subterms of X do not necessarily occur in [x]X.

For Curry, the fact that the term X in the conclusion of a deduction in the theory of functionality must be irreducible in the sense of \lambda-calculus was not very satisfactory. Curry usually thought in combinators rather than \lambda-terms. Thus, he set out to find a reduction among combinatory terms that would be more like \lambda-reduction. He began with \lambda \beta \eta-reduction, which is \lambda-calculus in which the reduction rules include (\alpha), the rule for changes of bound variables, (\beta), the basic reduction for \lambda-calculus, which says that (\lambda x \;.\; M)N \rhd [N/x]M, the result of substituting N for x in M, and (\eta), the rule which says that if x is not free in M, then \lambda x \;.\; Mx \rhd M. He then defined a strong reduction for combinatory logic that is equivalent to \lambda \beta \eta-reduction. For technical reasons, he needed to take \mathsf{I} as a primitive combinator instead of defining it as \mathsf{S} \mathsf{K} \mathsf{K} as he had done previously, so now combinatory logic is usually defined by taking the three combinators \mathsf{I}, \mathsf{K}, and \mathsf{S} as primitive combinators.

Curry soon managed to prove that the full free theory of functionality is, in fact, inconsistent. The book [Curry and Feys 1958] ends with a chapter including the proof that the full free theory is inconsistent and also some results that are true that were proved as part of the failed attempt to prove it consistent.

This volume also includes the first published proof of the Normal Form Theorem, which says that every term with a type has a normal form. (A term is said to be in normal form if it cannot be reduced. It is said to have a normal form if it can be reduced to a term in normal form.) This result has become more and more important in various systems of typed \lambda-calculi in the decades since this volume was published.

In the years immediately after the publication of [Curry and Feys 1958], Curry began to work on systems of restricted generality. But he only published a couple of papers on this before he began work on [Curry et al. 1972]. This volume begins with addenda to pure combinatory logic, most of which are highly technical. Curry did try to devise a general framework that would include both combinatory logic and \lambda-calculus by defining what he called C-systems. The idea was to set up a framework that could be used to prove results in illative systems that were based either on \lambda-calculus or on combinatory logic without having to give separate proofs for the two cases. But this attempt was not completely successful, since it was later found that many results still needed one proof for \lambda-calculus and another for combinatory logic.

Curry also extended the definition of illative combinatory logic to include any systems with new atomic constants that have special postulates associated with them, even if these new constants do not represent logical connectives or quantifiers. This allowed him to include systems of combinatory arithmetic. Arithmetic had first been represented by Alonzo Church in combinatory logic and \lambda-calculus by defining natural numbers as iterators: the number n is represented by \lambda f x \;.\; \underbrace{f ( f ( f \ldots (f}_{n} x) \ldots )), which applies f to x n times. But by the 1960s, other ways of representing numbers as combinators or \lambda-terms had appeared. For this reason, Curry suggested representing numbers by taking new atomic constants to represent 0 and the successor function (\sigma) and including a combinator that mapped one of these numbers to the corresponding iterator. With any of these representations, a function can be represented by a combinator or \lambda-term if and only if it is partial recursive, or, equivalently, Turing-computable. (This result was first proved for \lambda-calculus independently by Church, Kleene, and Turing in 1936; see, for example, [Kleene 1936c].)
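The representation of numbers as iterators can be tried out directly, since Python has anonymous functions. The following is a sketch using the standard definitions of successor, addition, and multiplication on iterators; these definitions and the helper names to_int and from_int are supplied for this illustration and are not quoted from Church or Curry.

```python
# The number n is the function that takes f and x and applies f to x n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

# Standard arithmetic on iterators (an assumption of this sketch).
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))

def to_int(n):
    """Read off an iterator by applying 'add one' to 0."""
    return n(lambda k: k + 1)(0)

def from_int(k):
    """Build the iterator for k by applying succ to zero k times."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n

two, three = from_int(2), from_int(3)
print(to_int(add(two)(three)))   # 5
print(to_int(mul(two)(three)))   # 6
```

Curry’s alternative described above would instead treat 0 and \sigma as new atomic constants and add a combinator that converts such a numeral into the corresponding iterator; from_int plays a loosely analogous role here for Python integers.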

Curry also considered extensions of the results on the theory of functionality, including the introduction of a new typing operator \mathsf{G} with the rule that from \mathsf{G} \alpha \beta X and \alpha Y follows \beta Y (XY), so that the type of the value of a function may depend on the argument as well as on the type of the argument. The type \mathsf{G} \alpha \beta is the type that is now usually denoted ( \Pi x : \alpha \;.\; \beta x), and is usually called the dependent function type. However, the type was only introduced, and no systems based on it were developed by Curry.

The rest of the book includes material on the theory of restricted generality and universal generality. It was shown that these kinds of systems are essentially equivalent. Systems were proved consistent that are essentially equivalent to first-order systems of logic by defining classes of canonical terms which are supposed to represent propositions and propositional functions. Attempts to find consistent systems in which the assumptions for terms to be canonical were stated as axioms of the logic were made, but most of the systems involved were later proved to be inconsistent. Finally, the theory of functionality was used to define systems of type theory in the traditional sense.

Curry spent the rest of his life continuing this work and other work he had done. The last problem he worked on was an attempt to find a reduction for combinatory terms that is equivalent to \lambda \beta-reduction, \lambda-reduction in which the contraction rules are only (\alpha) and (\beta). As of this writing, this problem is not yet settled. See Seldin’s papers [Seldin 2011] and [Seldin 2017].

3. Gentzen-style Proof Theory

Curry read Gentzen’s work [Gentzen 1934] two years after it appeared, and it did not take him long to realize that the ideas of that paper could be useful in finding a system of logic based on combinatory logic that could be proved consistent.

Gentzen had introduced two new formulations of logical systems: natural deduction systems and sequent calculi (L-systems). Natural deduction systems are covered in the article Deductive-Theoretic Conceptions of Logical Consequence in this encyclopedia. Sequent calculi are equivalent to natural deduction systems and are designed to search for proofs.

The consistency of natural deduction systems for propositional calculus and first-order predicate calculus follows from what is called the normalization theorem (due originally to Prawitz, [Prawitz 1965]). This result is equivalent to a result of Gentzen on sequent calculi: the cut elimination theorem. Curry worked out his own proof of the latter theorem. He also used a version of it to give the first published proof of the normal form theorem for ordinary basic functionality. (A proof by Turing from 1941 was not published until 1980; see [Gandy 1980b].)

Curry became convinced that a system of formal logic is not properly formalized unless there is a sequent calculus for it for which the cut elimination theorem can be proven.

Another feature of Curry’s approach is that he considered these systems as formalizing the elementary metatheory of what he called an elementary formal system. An elementary formal system is one in which there are no rules which discharge assumptions. Curry had such a formal system for combinatory logic. He used the idea that he was formalizing the elementary metatheory of an elementary formal system to justify all the operational rules. This illustrates that Curry was concerned with semantics.

4. War Work and Computing

When Curry first left Penn State to do applied mathematics for the U.S. Government, he began working on the mathematics of aiming a projectile at a moving target, the so-called fire control problem. Curry had studied this kind of mathematics as a student, and so he had little trouble doing this work during World War II.

By 1945, when Curry was at the Aberdeen Proving Ground, there was word that an electronic computer, the ENIAC, was being built for the purpose of calculating firing tables for the artillery. Curry was named to the committee that was being set up to evaluate the ENIAC when it was delivered. This committee first met in July, 1945, and early that month Curry attended a lecture on the ENIAC by Herman Goldstine. The next day, he decided to write a program to calculate the digits of e, the base of the natural logarithms. He finished the program in early 1946, but whether it was ever run is uncertain. Curry later reported that nobody else that he knew at the time who was working on the ENIAC in 1946 could see the point of using a computer for a result assumed to be known.

In 1949, John von Neumann and some colleagues wrote and ran programs to calculate the digits of \pi and e. (See [Reitwiesner 1950a] and [Reitwiesner et al. 1950b].) As a result, they discovered that the amateur mathematician William Shanks, who had spent over two decades starting in the middle of the 19th century calculating digits of \pi, and who had calculated to 707 digits, had made a mistake on digit number 528. The people who wrote the program in 1949 seem to have had no idea that Curry wrote such a program just a few years earlier. On the other hand, by 1949 there had been some changes in the ENIAC, and the program Curry wrote in 1945–46 might no longer have been compatible with the ENIAC.

Curry also became involved in writing programs to do inverse interpolation on the ENIAC, programs useful for dealing with firing tables. See [de Mol et al. 2010].

Curry’s work on programming inverse interpolation on the ENIAC led him to develop a theory of programming. Curry’s basic approach was very similar to the approach he had taken two decades earlier in analyzing the process of substitution. He broke programs down into the simplest possible elementary components and then proposed using program composition to put them together again. This approach has been compared to the later development of compilers for user languages. See [Curry 1954].

However, Curry was not able to continue to work on this development because he could not persuade Penn State to buy any computer equipment in the late 1940s.

5. Formalism: the Philosophy of Mathematics

Curry developed a distinctive philosophy of mathematics. His views developed considerably over the course of his career, but he is mostly known for his earlier works on the subject.

Curry’s earliest philosophical work, dating from 1939, proposed to define mathematics as the science of formal systems. But Curry’s approach to formal systems was not quite the same as that of most others in the field.

The usual definition of a formal system begins by defining the formal objects as words on an alphabet of symbols, or, to use the terminology more current in computer science today, strings of characters. But then some of these words are picked out as “well formed formulas” by an inductive definition with the property that each well formed formula has a unique construction from the “atomic formulas”. For example, for the propositional calculus, we are given a possibly infinite set of atomic formulas p_{1}, p_{2}, \ldots , p_{n} , \ldots, and a typical definition of well formed formula goes as follows:

  • Every atomic formula is a well formed formula.
  • If P is a well formed formula, then \neg P is a well formed formula.
  • If P and Q are well formed formulas, then P \wedge Q, P \vee Q, and P \supset Q are well formed formulas.
  • Nothing else is a well formed formula.

If the logical system involved includes quantifiers, then the atomic formulas are themselves defined, and that definition may depend on inductive definitions. For example, if we are defining a formal system for first-order logic, we start with terms, which are built up out of atomic terms and individual variables by using basic functions, and then we have predicates, from which the atomic formulas are obtained by applying them to terms. If the first order system is a system of arithmetic, we start with the atomic term 0 and functions denoted by \prime (as a superscript) and + and \cdot (as infixes), and then terms are defined as follows:

  • Every individual variable is a term.
  • 0 is a term.
  • If t is a term, then so is t^{\prime}. (This is intended to denote the number that is one more than t.)
  • If s and t are terms, then s+t and s\cdot t are terms. (The term s\cdot t is often abbreviated as st.)
  • Nothing else is a term.

Once terms have been defined, atomic formulas are defined as follows:

  • If s and t are terms, then s = t is an atomic formula.

And then the following clause is added to the definition of well formed formula:

  • If x is an individual variable and A is a well formed formula, then (\forall x)A and (\exists x)A are well formed formulas.

Curry noted that although these definitions of term, atomic formula, and well formed formula say they are about strings of symbols on some alphabet, they do not really depend on that fact. For him, the crucial thing was that each term and well formed formula have a unique construction, whereas any word of three or more letters has more than one construction. (For example, the string abc can be formed in two ways: c can be added to ab on the right, or a can be added to bc on the left.) So while we obviously represent formal objects on a page or on a blackboard as strings of characters, it is not necessary that they actually be such strings. The strings may only be the names for these formal objects. It is only necessary that they are defined inductively so that each one has a unique construction.
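One way to see the point about unique construction is to represent formal objects as trees rather than strings. The sketch below, in Python, follows the inductive clauses given above for the terms and formulas of first-order arithmetic; the nested-tuple encoding and the tags "succ", "not", "and", "or", "implies", "forall", and "exists" are hypothetical choices for this illustration, not Curry’s notation.

```python
def is_variable(t):
    """An individual variable is represented by an identifier-like string."""
    return isinstance(t, str) and t.isidentifier()

def is_term(t):
    """Terms: variables, 0, successor ("succ", t), and ("+", s, t), ("*", s, t)."""
    if isinstance(t, str):
        return t == "0" or is_variable(t)
    if isinstance(t, tuple) and len(t) == 2 and t[0] == "succ":
        return is_term(t[1])
    if isinstance(t, tuple) and len(t) == 3 and t[0] in ("+", "*"):
        return is_term(t[1]) and is_term(t[2])
    return False                                  # nothing else is a term

def is_formula(a):
    """Well formed formulas built from atomic formulas ("=", s, t)."""
    if not isinstance(a, tuple) or not a:
        return False
    tag = a[0]
    if tag == "=" and len(a) == 3:
        return is_term(a[1]) and is_term(a[2])
    if tag == "not" and len(a) == 2:
        return is_formula(a[1])
    if tag in ("and", "or", "implies") and len(a) == 3:
        return is_formula(a[1]) and is_formula(a[2])
    if tag in ("forall", "exists") and len(a) == 3:
        return is_variable(a[1]) and is_formula(a[2])
    return False                                  # nothing else is a formula

# (forall x)(x + 0 = x), written as a tree.
print(is_formula(("forall", "x", ("=", ("+", "x", "0"), "x"))))   # True
```

Each tuple records exactly one way in which the object was built, so two different constructions can never yield the same object. The strings printed on a page then serve only as names for these trees.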

Also, formal systems do not need to be systems of logic in the ordinary sense with logical connectives and quantifiers. It is possible to have a simpler formal system. An example Curry gave is what he called the “system of Sams” for natural numbers. (He got this name from the Hungarian word for number, which is szám.) In this system, the formal objects are interpreted as natural numbers. There is one primitive formal object, which I will name “0”. There is one operation, which forms X| from X. The rules for forming the sams are as follows:

  • 0 is a sam.
  • If X is a sam, then so is X|.
  • Nothing else is a sam.

There is one predicate, which forms X = Y from sams X and Y. Thus, the elementary statements are those of the form X = Y, where X and Y are sams. There is one axiom, namely

0 = 0

There is also one rule of inference: From X = Y to deduce X| = Y|. This is a very simple formal system, and it is easy to show that the theorems (provable elementary statements) are those of the form X = X, where X is a sam.
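The system of sams is small enough to be written out in full. The following Python sketch uses strings such as 0|| purely as names for the sams; the function names is_sam and theorems are hypothetical and are introduced only for this illustration.

```python
def is_sam(s):
    """0 is a sam; if X is a sam, then so is X|; nothing else is a sam."""
    return s == "0" or (s.endswith("|") and is_sam(s[:-1]))

def theorems(count):
    """Yield the first `count` theorems as pairs (X, Y) standing for X = Y."""
    statement = ("0", "0")                    # the single axiom 0 = 0
    for _ in range(count):
        yield statement
        x, y = statement
        statement = (x + "|", y + "|")        # the rule: from X = Y deduce X| = Y|

for x, y in theorems(3):
    print(x, "=", y)
# 0 = 0
# 0| = 0|
# 0|| = 0||
```

Since the only axiom has the same sam on both sides and the rule adds a stroke to both sides at once, every statement generated in this way has the form X = X, in line with the claim above.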

In saying that mathematics is the science of formal systems, Curry was claiming that (pure) mathematics does not really have a subject matter. It was not what he called a contensive topic. (The word contensive is a word Curry coined to express the idea of the German word inhaltlich.) Of course, mathematical statements do have subjects and therefore subject matter, but Curry claimed that the only subject matter any mathematical statements had was other mathematics.

Curry’s attitude towards truth was that truth comes in two kinds:

  1. Truth within a formal system (or within a given theory). This depends on how the system or theory is defined.
  2. The acceptability of a system (or theory) for some purpose. This depends on the purpose, and Curry took this pragmatically.

In his work on combinatory logic and Gentzen proof theory, he preferred to use only constructive logic in the metatheory, since he believed this would be accepted by more people than classical logic. (In this, he did not see that most mathematicians were not familiar with constructive mathematics.) On the other hand, he had no trouble accepting classical logic in the mathematics to be used in physics. In a sense, Curry did not really believe in one absolute notion of truth.

On the other hand, once formal systems (or any other kind of theories) are created, they have properties which can be investigated, and hence have objective existence. In this sense, Curry believed in the idea that Karl Popper introduced later of the third world. In fact, Popper presented this idea at a session of the Third International Congress of Logic, Methodology, and Philosophy of Science in Amsterdam in 1967, and as it happened Curry was the chair of the session. (See [Popper 1968].) After Popper’s presentation was over, Curry told his graduate student Jonathan P. Seldin, who was also present, that he thought that Popper had made a big deal out of something that was trivially and obviously true.

Over his career, Curry changed several times the words he used to denote the formal objects of a formal system. In his earliest work on combinatory logic, he called them “entities” (using the German word Etwas as a noun in his dissertation, which was written in German). However, in a discussion with a philosopher (whom he did not name in his later years), he was told that his use of that word implied some philosophical conclusions with which he disagreed. At that point, he decided to use the word “term” instead. It is now common to refer to “combinatory terms” and “\lambda-terms”. However, this caused him a problem when he was dealing with a formal system of logic with quantifiers, since the terms would be what are usually called “formulas”, and there are other formal objects called “terms”. So in the end, he coined his own word by taking the first syllable of the word “object”, and started calling them obs. To some people, the word ob appeared to refer specifically to combinatory logic, but in fact Curry used the word for formal objects of any kind of formal system.

In his later work, Curry extended his definition of formal system to allow for systems whose formal objects are strings of characters. He called such systems syntactical systems, and called his earlier kind of formal systems ob systems.

In his later work, Curry also extended his definition of mathematics from saying that mathematics is the science of formal systems to saying that mathematics is the science of formal methods. This definition should be sufficiently broad to include all of mathematics, since if we compare piles of apples and oranges by seeing if there is a one-to-one correspondence between them, we are looking at the forms of the piles rather than the content (apples or oranges).

Curry chose the name “formalism” for his philosophy of mathematics because of David Hilbert. However, Curry’s idea of formalism is very different from the idea of other philosophers of mathematics who call themselves formalists. It is probably better to think of Curry’s formalism as a kind of structuralism.

6. References and Further Reading

a. Primary Sources

  • [Curry 1929] Curry, Haskell B., “An analysis of logical substitution”, American Journal of Mathematics 51, 363-384.
    • Curry’s first published paper, written as part of an application for a grant to go to Göttingen.
  • [Curry 1930] Curry, Haskell B., “Grundlagen der kombinatorischen Logik”, American Journal of Mathematics 52 (1930) 509-536, 789-834.
    • Curry’s dissertation, written in German at Göttingen in 1928-1929. Republished with a translation into English and an introduction on Curry’s work by Fairouz Kamareddine and Jonathan P. Seldin as Foundations of Combinatory Logic by College Publications, 2016.
  • [Curry 1934a] Curry, Haskell B., “Some properties of equality and implication in combinatory logic”, Annals of Mathematics (2) 34, 381-404.
    • This is the paper that gave Kleene and Rosser what they needed to prove inconsistent the systems of Church and Curry.
  • [Curry 1934b] Curry, Haskell B., “Functionality in combinatory logic”, Proceedings of the National Academy of Sciences U.S.A., 20, 584-590.
    • An extended abstract of [Curry 1936] below, which Curry had some trouble getting accepted for publication because the approach originally looked strange.
  • [Curry 1936] Curry, Haskell B., “First properties of functionality in combinatory logic”, Tohoku Mathematical Journal 41 Part II, 371-401.
    • Curry’s first paper on functionality. He originally wrote it in 1932, but had trouble getting it accepted for publication. The version published in 1936 contains many misprints.
  • [Curry 1939] Curry, Haskell B., “Remarks on the definition and nature of mathematics”, Journal of Unified Science 9, 164-169, and reprinted many times since.
    • Curry’s first work on the philosophy of mathematics.
  • [Curry 1941] Curry, Haskell B., “The paradox of Kleene and Rosser”, Transactions of the American Mathematical Society, 50, 454-516.
    • Curry’s study of the paradox mentioned in the title.
  • [Curry 1942a] Curry, Haskell B., “Some advances in the combinatory theory of quantification”, Proceedings of the National Academy of Sciences U.S.A. 28, 564-569.
    • This is the paper Curry wrote just before his leave of absence from Penn State to do war work in which he set out his plans to try to find consistent systems of logic based on combinatory logic.
  • [Curry 1942b] Curry, Haskell B., “The inconsistency of certain formal logics”, Journal of Symbolic Logic 7, 115-117.
  • [Curry 1949] Curry, Haskell B., “A simplification of the theory of combinators”, Synthese 7, 391-399.
    • The paper in which Curry published his understanding of the combinator S.
  • [Curry 1950] Curry, Haskell B., A Theory of Formal Deducibility, (Indiana University Press).
    • Curry’s first book on Gentzen-style proof theory.
  • [Curry 1951] Curry, Haskell B., Outlines of a Formalist Philosophy of Mathematics (Amsterdam, North-Holland).
    • This was mostly written in 1939 and is essentially the long manuscript from which the paper of 1939 was prepared as a shorter version.
  • [Curry 1954] Curry, Haskell B., “The logic of program composition”, in Applications Scientifiques de la Logique Mathematique, Actes du 2e Colloque International de Logique Mathematiques, Paris 25-30 Aout 1952, Institut Henri Poincare, (Paris: Gauthier-Villars and Louvain: Nauwelaerts).
    • Curry’s summary of his theory of programming.
  • [Curry and Feys 1958] Curry, Haskell B. and Feys, Robert, Combinatory Logic, Volume I, (Amsterdam, North-Holland).
    • The first volume of Curry’s great work on combinatory logic.
  • [Curry 1963] Curry, Haskell B., Foundations of Mathematical Logic, (McGraw-Hill, and since reprinted by Dover).
    • Curry’s major work on Gentzen-style proof theory.
  • [Curry et al. 1972] Curry, Haskell B., Hindley, J. Roger, and Seldin, Jonathan P., Combinatory Logic, Volume II, (Amsterdam, North-Holland).
    • The second volume of Curry’s great work on combinatory logic.

b. Secondary Sources (by year)

  • [Russell and Whitehead 1910-1913] Russell, Bertrand and Whitehead, Alfred North, Principia Mathematica, 3 volumes (Cambridge University Press).
    • The first major work on logic that Curry read.
  • [Schönfinkel 1924] Schönfinkel, Moses, Über die Bausteine der mathematischen Logik”, Mathematische Annalen 92, 305-306.
    • A work that Curry first encountered in 1927-28 which, much to his surprise, had anticipated his own idea for combinators. The paper was written by Behman, and was a report on a seminar talk Schonnkel had given at Gottingen in 1920. An English translation has appeared as “On the building blocks of mathematical logic”, in From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931, edited by Jean van Heijenoort (Harvard University Press, 1967), pp. 355-366.
  • [Hilbert 1925] David Hilbert, Über das Unendliche”, Mathematische Annalen 95 (1925) 161-190.
    • One of the most important papers Hilbert wrote on the foundations of mathematics. Reprinted (in German) in David Hilbert, Hilbertiana: Fünf Aufsätze (Darmstadt: Wissenschaftliche Buchgesellschaft, 1964), pp. 79-108. Translation into English published as “On the infinite” in Jean van Heijenoort (editor), From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931, (Cambridge, MA and London, England: Harvard University Press 1967), pages 367-392.
  • [Heyting 1930] Heyting, Arend, “Die formalen Regeln der intuitionistischen Logik”, Sitzungsberichte der Preussischen Akademie der Wissenschaften, Physikalisch-Mathematische Klasse 1930, 42-56.
    • The paper in which Heyting introduced his formal system of intuitionistic logic.
  • [Church 1932] Church, Alonzo, “A set of postulates for the foundation of logic”, Annals of Mathematics (2) 33, 346-366.
    • The paper in which Church first introduced abstraction as part of a larger system.
  • [Gentzen 1934] Gentzen, G., “Untersuchungen über das logische Schliessen”, Mathematische Zeitschrift 39, 405-431.
    • The paper in which Gentzen introduced his systems of natural deduction and his L-systems (sequent calculi).
  • [Kleene 1935] Kleene, Steven C. and Rosser, J. Barkley, “The inconsistency of certain formal logics”, Annals of Mathematics (2) 36, 630-636.
    • The paper in which Kleene and Rosser published their proof of the contradiction in the systems of Church and Curry.
  • [Church and Rosser 1936a] Church, Alonzo and Rosser, J. Barkley, “Some properties of conversion”, Transactions of the American Mathematical Society 39, 472-482.
    • The paper in which the Church-Rosser Theorem was first proved for lambda-calculus.
  • [Church 1936b] Church, Alonzo, “An unsolvable problem of elementary number theory”, American Journal of Mathematics 58, 345-363.
    • The paper in which Church proved that there is a problem in elementary number theory which cannot be decided by an algorithm. The paper includes a statement by Church that a function is partial recursive if and only if it can be represented by a λ-term, a result that he and Kleene obtained independently about the same time.
  • [Kleene 1936c] Kleene, Steven C., “λ-definability and recursiveness”, Duke Mathematical Journal 2, 340-353.
    • The paper in which Kleene first proved that a function is partial recursive if and only if it can be represented by a λ-term, a result he discovered independently at the same time Alonzo Church did. This formed part of the justification of the Church-Turing thesis, that a function is mechanically computable if and only if it is partial recursive if and only if it is Turing computable if and only if it is λ-definable.
  • [Rosser 1942] Rosser, J. Barkley, “New sets of postulates for combinatory logics”, Journal of Symbolic Logic 7, 18-27.
    • Rosser’s paper that enabled Curry to understand the combinator S, although Rosser did not use that combinator.
  • [Reitwiesner 1950a] Reitwiesner, George W., “An ENIAC determination of pi and e to more than 2000 decimal places”, Mathematical Tables and Other Aids to Computation, 4, 11-15.
    • A paper on the program run on the ENIAC to calculate digits of π and e in 1949-1950. The paper shows no indication of any knowledge of the program Curry wrote to do this for e in 1945-46.
  • [Reitwiesner et al. 1950b] Metropolis, N. C., Reitwiesner, G., and von Neumann, J., “Statistical treatment of the values of first 2,000 decimal digits of e and π calculated on the ENIAC”, Mathematical Tables and Other Aids to Computation, 4, 109-111.
    • The statistical analysis of the results of the program run on the ENIAC as described by George W. Reitwiesner.
  • [Prawitz 1965] Prawitz, Dag, Natural Deduction: A Proof-Theoretical Study, Almqvist & Wiksell, 1965. Reprinted by Dover in 2006.
    • This was originally Prawitz’ doctoral dissertation, and introduced Prawitz’ ideas of proof reduction and proof normalization.
  • [Popper 1968] Popper, K. R., “Epistemology without a knowing subject”, in van Rootselaar, B. and Staal, J. F. (editors), Logic, Methodology and Philosophy of Science III: Proceedings of the Third International Congress for Logic, Methodology and Philosophy of Science, Amsterdam 1967, (Amsterdam: North-Holland), pp. 333-373.
    • This is the paper in which Popper introduced his idea of the third world. The paper had been presented in the first session of the congress (11:15 a.m. to 12:00 noon, with H. B. Curry in the chair) under the title “Epistemology and scientific knowledge”. See the program of the congress on p. 543 of the proceedings.
  • [Hindley and Seldin 1980a] Hindley, J. Roger and Seldin, Jonathan P. (editors), To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, (Academic Press).
    • A collection of papers related to Curry’s work. Includes a short biography and a complete list of Curry’s publications.
  • [Gandy 1980b] Gandy, R. O., “An early proof of normalization by A. M. Turing”, in [1980a], pp. 453-455.
    • This is Turing’s earliest proof of the normal form theorem for typed λ-calculus, with an introduction by Gandy.
  • [Hindley and Seldin 2008] Hindley, J. Roger and Seldin, Jonathan P., Lambda-Calculus and Combinators, An Introduction, (Cambridge University Press).
    • A general introduction to lambda-calculus and combinatory logic.
  • [de Mol et al. 2010] de Mol, Liesbeth, Bullynck, Maarten, and Carlé, Martin, “Haskell before Haskell. Curry’s contribution to programming (1946-1950)”, in Ferreira, F., Löwe, B., Mayordomo, E., and Gomes, L.M. (Eds.), Programs, Proofs, Processes, 6th Conference on Computability in Europe, CIE, 2010, Ponta Delgada, Azores, Portugal, June 30-July 4, 2010, Springer Lecture Notes in Computer Science, vol. 6158, pp. 108-117.
    • A paper on Curry’s theory of programming.
  • [Seldin 2011] Seldin, Jonathan P., “The search for a reduction in combinatory logic equivalent to λβ-reduction”, Theoretical Computer Science 412, 4905-4918.
    • A paper describing the attempt to find a reduction in combinatory logic equivalent to λβ-reduction, including a discussion of the technical problems involved.
  • [Seldin 2017] Seldin, Jonathan P., “The search for a reduction in combinatory logic equivalent to λβ-reduction, Part II”, Theoretical Computer Science 663, 34-58.
    • A paper giving the proofs of the key properties of the proposals given in Seldin 2011.

Author Information

Jonathan P. Seldin
Email: jonathan.seldin@uleth.ca
University of Lethbridge
Canada

Cognitive Phenomenology

Phenomenal states are mental states such that there is something that it is like for their subjects to be in them; they are states with a phenomenology. What it is like to be in a mental state is that state’s phenomenal character. There is general agreement among philosophers of mind that the category of phenomenal states includes at least some sensory states. For example, there is something that it is like to taste chocolate, to smell coffee, to feel the wind in one’s hair, to see the blue sky and to feel a pain in one’s toe. Is there also something that it is like to consciously think, to consciously judge and to consciously believe something? Are such cognitive states, when conscious, phenomenal states? Is there a clear distinction between sensory states and cognitive states? Or can our knowledge, thoughts and beliefs influence our sensory experiences? Is there a cognitive phenomenology?

It is challenging to give a clear characterization of the cognitive phenomenology debate, since different contributors conceive of the debate in different ways. Central to the debate is the question of whether conscious thoughts possess a non-sensory phenomenology. Intuitively, there is something that it is like to consciously think, consciously judge and consciously believe something. However, the debate about cognitive phenomenology is not, strictly speaking, about whether there is something that it is like to consciously think. Rather, the debate concerns the nature of cognitive phenomenology. Is the phenomenology of cognitive states reducible to purely sensory phenomenology? Or is there an irreducible cognitive phenomenology? A sceptic about cognitive phenomenology claims that conscious cognitive states are non-phenomenal, although they may seem to be phenomenal because they are accompanied by sensory states. For instance, when one thinks that ‘Paris is a beautiful city’, one’s thought may be expressed in inner speech and an image of Paris may accompany it. According to the sceptic, it is these accompanying sensory states that are phenomenal states, and not the thought itself. Contrary to this, the proponent of cognitive phenomenology claims that a conscious cognitive state can have a phenomenology that is irreducible to purely sensory phenomenology.

Other debates have also been placed under the ‘cognitive phenomenology’ label. There is an ongoing debate within the philosophy of perception about how cognition influences our sensory experiences. Philosophers tend to agree that, for example, an expert ornithologist’s perceptual experience of a type of bird can differ from that of a novice, even if the viewing conditions for both expert and novice are the same. The expert’s knowledge of birds can influence her experience. However, what philosophers disagree about is how the expert’s knowledge influences her experience, and how her knowledge contributes to what her experience is like.

Table of Contents

  1. Background
    1. Terminological Clarifications
    2. Two Kinds of Mental States
    3. Phenomenal Intentionality
  2. The Nature of Cognitive Phenomenology
    1. Irreducible Cognitive Phenomenology
    2. Proprietary Cognitive Phenomenology
    3. Pure and Impure Cognitive Phenomenology
    4. Attitudinal Phenomenology and Content Phenomenology
    5. General, Particular and Individuative Cognitive Phenomenology
  3. Arguments for Cognitive Phenomenology
    1. Arguments from Examples
    2. Contrast Arguments
    3. The Self-Knowledge Argument
    4. An Argument for Pure Cognitive Phenomenology
    5. Individual Differences
  4. Implications of the Cognitive Phenomenology Debate
  5. References and Further Reading

1. Background

a. Terminological Clarifications

When this article talks about a state being conscious, being conscious should be understood as being phenomenally conscious. A phenomenal state is a mental state that is phenomenally conscious in that there is something that it is like for the subject of that state to be in that state. Phenomenal states are states with phenomenology. What it is like to be in a phenomenal state is that state’s phenomenal character. An example of a phenomenal state is a visual experience of the blueness of the sea. Another example is an auditory experience of the sound of waves. There is something that it is like to have these experiences. There is also something that it is like to simultaneously visually experience the blueness of the sea and auditorily experience the sound of the waves (Bayne & Chalmers 2003). Our everyday conscious experiences are often complex in that they involve simultaneously thinking, feeling and experiencing within different sensory modalities. Such a complex experience is referred to as an overall phenomenal state.

Examples of sensory mental states are perceptual states, proprioception, bodily feelings and pains. Examples of cognitive states are thoughts, judgments and beliefs. According to some views, emotions and categorical perceptual experiences (such as experiencing something as being a type of bird) should also be categorized as cognitive states, or as partly cognitive and partly sensory states (see Chudnoff 2015a, Montague 2017).

b. Two Kinds of Mental States

Traditionally, it was common to distinguish between two kinds of mental states, namely sensory states and propositional attitudes. Paradigmatic examples of propositional attitudes are cognitive states such as beliefs, desires, thoughts and judgements. Propositional attitudes are intentional states since they are about or represent objects, properties or states of affairs. They are states with propositional contents that can be linguistically expressed by using a ‘that-clause’. The content of my belief ‘that it will rain tomorrow’ is ‘that it will rain tomorrow’. When I believe ‘that it will rain tomorrow’ I am having a certain attitude towards that content, namely the attitude of belief. I could have had a different attitude towards the same content; I could, for instance, desire ‘that it will rain tomorrow’.

According to the traditional view, sensory mental states, unlike cognitive states, have qualia. On this view, qualia are seen as phenomenal properties that can be separated from intentional or representational properties. For example, my visual experience of a red rose in front of me is intentional in that it is about or represents ‘that there is a red rose in front of me’, but there is also something that it is like for me to experience the red rose. The redness that I experience is a property of my experience, a quale. While conscious sensory states are regarded as phenomenal states with qualia, conscious cognitive states are said to lack qualia. They are seen as non-phenomenal states.

Lately, this traditional view has been challenged. Firstly, proponents of intentionalism argue that when I experience a red rose I experience the redness as a property of the rose itself, and not as a property of my experience of the rose. My experience of the red rose has a phenomenal character, but this phenomenal character is embedded in the intentional content of my experience. Secondly, proponents of cognitive phenomenology challenge the assumption that cognitive states are non-phenomenal states when conscious.

c. Phenomenal Intentionality

In their seminal 2002 paper ‘The Intentionality of Phenomenology and the Phenomenology of Intentionality’, Horgan and Tienson argue against the traditional view and in favour of intentionalism and cognitive phenomenology. They also argue for a view about the relation between the intentional and the phenomenal that has recently gained popularity: phenomenal intentionalism.

According to intentionalism, all mental states are intentional, including phenomenal states. A mental state is commonly regarded as intentional if it is about or directed towards some objects or states of affairs, and if it has a content.

Phenomenal intentionality is a kind of intentionality that is said to be grounded in phenomenal consciousness (Kriegel 2011, Mendelovici 2018). According to proponents of phenomenal intentionalism, there is a phenomenal intentionality, and all other forms of intentionality are derived from it. While other proponents of intentionalism hold that intentionality is prior to phenomenology (see, for example, Tye 1995 and Dretske 1995), proponents of phenomenal intentionalism claim that phenomenology, or phenomenal intentionality, is prior to all other forms of intentionality (Horgan & Tienson 2002, Kriegel 2011, Mendelovici 2018).

While most proponents of phenomenal intentionalism also claim that there is a cognitive phenomenology, the two views should not be conflated. Phenomenal intentionalism is a view about what it is that grounds the relation between phenomenal consciousness and intentionality, while cognitive phenomenology is a view about the scope of phenomenal consciousness. A proponent of cognitive phenomenology need not accept phenomenal intentionalism, and it is not necessary for a proponent of phenomenal intentionalism to hold that there is a cognitive phenomenology. However, since proponents of phenomenal intentionalism claim that all intentionality is derived from phenomenal intentionality, it is easier to explain the intentionality of cognitive states if one holds that conscious cognitive states are phenomenal states. If one denies that there is a cognitive phenomenology and accepts phenomenal intentionalism, one needs to tell a story about how the intentionality of cognitive states is derived from the phenomenal intentionality of sensory states. If, by contrast, one holds that there is a cognitive phenomenology, one can simply claim that the intentionality of non-conscious cognitive states (such as dispositional beliefs) is derived from the phenomenal intentionality of conscious cognitive states.

2. The Nature of Cognitive Phenomenology

The debate about whether or not there is a cognitive phenomenology can seem bewildering since there are different claims about what cognitive phenomenology is, and these claims may vary in both strength and generality.

a. Irreducible Cognitive Phenomenology

According to Elijah Chudnoff (2015a), a proponent of cognitive phenomenology should minimally accept the irreducibility thesis.

Irreducibility: ‘Some cognitive states put one in phenomenal states for which no wholly sensory states suffice’ (Chudnoff 2015a: 15).

It follows from Irreducibility that some cognitive states are such that, because one is in them, one is in a phenomenal state for which no wholly sensory states suffice. That is, there is a phenomenal character that is over and above the phenomenal character that accrues to sensory states. Putting one in a phenomenal state should be understood as a non-causal explanatory relation that can alternatively be picked out by ‘in virtue of’ or ‘constitutively dependent on’ (see Chudnoff 2015b).

In order to get a better grip on the Irreducibility thesis we can contrast it with an alternative view on the relation between cognitive states and phenomenal states. It is uncontroversial to claim that cognitive states can make an impact on our sensory states. For instance, judging that the sum of the angles of a triangle is 180 degrees can lead one to visualize the triangle or to express sentences such as ‘the sum of the angles of a triangle is 180 degrees’ in inner speech. In this case, one is in a phenomenal state since one is in a certain cognitive state, but the phenomenal state one is in is not different from the phenomenal state various wholly sensory states can put one in (Chudnoff 2015a). What Irreducibility claims is that some cognitive states can put one in phenomenal states that are different from those phenomenal states that wholly sensory states can put one in. Chudnoff uses an example from mathematics to illustrate how Irreducibility differs from the view that cognitive states merely cause one to be in a certain phenomenal state. At first you read that ‘If a < 1, then 2 – 2a > 0’, and you wonder whether this is true (Chudnoff 2015a: 15). Then you realise how a’s being less than 1 makes 2a smaller than 2 and so 2 – 2a greater than 0. When you realise the truth of this mathematical proposition you might say to yourself in inner speech ‘If a < 1, then 2 – 2a > 0’ and you might visualize the variable ‘a’ and the numeral ‘1’. You might also feel satisfied because you got it right. These states that you are put in are all sensory phenomenal states. However, if you believe Irreducibility and if you think that this case of realising the truth of this mathematical proposition involves cognitive phenomenology, then you also believe that these sensory states taken together cannot account for the overall phenomenal state you are in. You think that there is some phenomenal state that is left over which only the cognitive states of ‘realising’ or ‘intuiting’ can put you in.

Following Chudnoff, Irreducibility is the thesis that a proponent of cognitive phenomenology must minimally accept. There are other theses figuring within the cognitive phenomenology debate that go beyond Irreducibility and make stronger and more specific claims about the nature of cognitive phenomenology.

b. Proprietary Cognitive Phenomenology

According to Irreducibility, some cognitive states put one in phenomenal states for which no wholly sensory states suffice. However, it does not follow from Irreducibility that only cognitive states put one in these phenomenal states. Neither does it follow from Irreducibility that the phenomenal character of the phenomenal states that cognitive states put one in is cognitively grounded, that is, that their phenomenal character is different in kind from sensory phenomenal character (Levine 2011).

Many proponents of cognitive phenomenology hold that there is a proprietary cognitive phenomenology (see Horgan & Tienson 2002, Horgan 2011, Kriegel 2011, Kriegel 2015a, Kriegel 2015b, Pitt 2004, Pitt 2011, Siewert 1998, Siewert 2011). On this view, the kind of phenomenology that philosophers are talking about when they are talking about cognitive phenomenology must differ in kind from the kind of phenomenology one is familiar with through one’s sensory experiences. As David Pitt puts it:

I believe that the phenomenology of occurrent conscious thought is proprietary: It´s a sui generis sort of phenomenology, as unlike, say, auditory or visual phenomenology as they are unlike each other—a cognitive phenomenology. (Pitt 2011: 141)

There is something that it is like to be in a conscious cognitive state and/or to consciously entertain a cognitive content, and this phenomenology is distinct from the phenomenology one experiences when one is consciously perceiving something or feeling something. Cognitive phenomenology is, on this view, proprietary and sui generis.  

Proprietary: Conscious cognitive states have proprietary or sui generis phenomenal character.

Someone who accepts Proprietary also accepts Irreducibility, but one may accept Irreducibility and deny Proprietary. For example, one could claim that knowing a lot about sparrows may influence the way one visually experiences sparrows so that one can be put in phenomenal states for which no wholly sensory states suffice. One’s knowledge does not merely cause one to attend to sparrows in a particular way. Rather, one’s knowledge puts one in a phenomenal state that one could not have been put in by wholly sensory states. In such a case, cognitive states can make a constitutive contribution to one’s perceptual experience by, for example, structuring the experience, without thereby producing a phenomenal state that is non-sensory in kind (see Levine 2011, Nes 2011). However, most philosophers hold that cognitive states can cause one to be in certain sensory states by influencing attention. Carruthers and Veillet (2011) argue that it is not clear that the sparrow expert’s experience involves irreducible cognitive phenomenology, since it is possible that her knowledge simply causes her to attend to sparrows in a different way compared with a novice. She will notice certain properties of the sparrows that the novice fails to notice, but the phenomenal state she is in is a state that wholly sensory states suffice to put her in. How should we decide between these views?

If cognitive phenomenology is proprietary, it should in principle also be possible to pick it out via introspection. Holding that cognitive phenomenology is proprietary allows one to appeal to introspection in cases where there is a dispute about whether cognitive phenomenology is involved or not. This may serve as a motivation for holding that cognitive phenomenology is proprietary, and not merely irreducible.

c. Pure and Impure Cognitive Phenomenology

We can further distinguish between three different ways of characterizing the nature of a phenomenal state: 1) A phenomenal state is purely sensory in case wholly sensory states suffice to put one in that state; 2) A phenomenal state is partly cognitive (and partly sensory) in case no wholly sensory states suffice to put one in that state and no wholly cognitive states suffice to put one in that state; 3) A phenomenal state is purely cognitive in case wholly cognitive states suffice to put one in that state (Chudnoff 2015b). A cognitive phenomenal state is an impure cognitive phenomenal state if 2 holds but not 3. A cognitive phenomenal state is a pure cognitive phenomenal state if 3 holds. In other words, a cognitive phenomenal state is a pure cognitive phenomenal state if it is independent of sensory states.

A proponent of cognitive phenomenology need not accept that there is pure cognitive phenomenology. It is compatible with Irreducibility that there is merely impure cognitive phenomenology. Many of the cases that are commonly appealed to in arguments for cognitive phenomenology seem to involve impure cognitive phenomenology. For instance, the overall phenomenal state one is in when one suddenly grasps a mathematical proposition arguably depends on both sensory experiences and intuiting. Proposed candidates for pure cognitive phenomenology are imageless thoughts and beliefs.

It is compatible with Irreducibility to deny that there is pure cognitive phenomenology. However, if one holds Proprietary one seems committed to accepting that pure cognitive phenomenology is, at least, possible. Following Proprietary, cognitive phenomenology is different in kind from other kinds of phenomenology, and it should in principle be possible to pick out this kind of phenomenology via introspection. When one is in a phenomenal state that involves different sensory modalities (such as the state one is in when watching a TV show), one seems able, at least roughly, to pick out and separate visual phenomenology from auditory phenomenology. This is because visual phenomenology is quite unlike auditory phenomenology. Similarly, when one is consciously thinking that p, one should be able to separate the phenomenology of thinking from the auditory phenomenology involved when expressing the content in inner speech. On this view, cognitive phenomenology is a sui generis kind of phenomenology, as unlike auditory and visual phenomenology as they are unlike each other (Pitt 2004, Pitt 2011).

d. Attitudinal Phenomenology and Content Phenomenology

Cognitive states such as thoughts, beliefs, judgements and inferences are propositional attitudes. One may think that conscious cognitive states have attitudinal cognitive phenomenology PA:

PA: There is something that it is like to have a conscious cognitive attitude towards a content, and no wholly sensory states suffice to put one in a state with this phenomenal character. 

PA is compatible with Irreducibility and Proprietary.

The claim that there is a cognitive phenomenology can also be a claim about the cognitive content that one is consciously entertaining when one is in a cognitive state. One may think that conscious cognitive states have content cognitive phenomenology CA:

CA: There is something that it is like to consciously entertain a cognitive content, and no wholly sensory states suffice to put one in a state with this phenomenal character.

A proponent of cognitive phenomenology can accept that there is an attitudinal cognitive phenomenology and deny that there is a content cognitive phenomenology. One can also hold that there is a content cognitive phenomenology, but not an attitudinal cognitive phenomenology. Or, one can accept that there is both an attitudinal cognitive phenomenology and a content cognitive phenomenology.

e. General, Particular and Individuative Cognitive Phenomenology

Cognitive phenomenology claims can be general claims, such as the claim that conscious cognitive attitudes have attitudinal cognitive phenomenology, where this attitudinal cognitive phenomenology is common to all cognitive attitudes. Alternatively, cognitive phenomenology claims can be particular claims: for example, that there is a particular cognitive phenomenology involved when one is consciously believing, and that this attitudinal cognitive phenomenology is different from the attitudinal cognitive phenomenology involved when one is having other conscious cognitive attitudes. One may also think of attitudinal cognitive phenomenology as even more fine-grained: for example, that there are different attitudinal cognitive phenomenologies involved in having different kinds of conscious beliefs.

The claim that there is a content phenomenology can likewise be more or less general. The most general claim is that there is a content cognitive phenomenology that is common to all cognitive contents. A more particular view claims that the content cognitive phenomenology involved in consciously entertaining the content that p, say, differs from the content cognitive phenomenology involved in consciously entertaining the content that q. An even more particular view holds that the content cognitive phenomenology involved in consciously entertaining the content that p is different from the content cognitive phenomenology involved in consciously entertaining any other cognitive content. Further, one could hold that the phenomenology involved in consciously entertaining the cognitive content that p may differ from person to person. For example, the content phenomenology involved when John consciously entertains the cognitive content that p differs from the content phenomenology involved when Jane consciously entertains the cognitive content that p.

Particular claims about either attitudinal cognitive phenomenology or content cognitive phenomenology are often motivated by the view that phenomenology is individuative. That is, in virtue of having the phenomenal character it has, my belief is a belief as opposed to a judgment, a thought or an intuition. And, in virtue of having the phenomenal character it has, the content that I am entertaining, the content that p, is the very content that p as opposed to the content that q. By claiming that phenomenology is individuative one can elegantly explain how one can determine the content of one’s own phenomenal state. One knows which phenomenal state one is in, and its content, because it has the phenomenal character that it has. For instance, when I am having a visual experience of a red rose I come to know, via introspection, that I am having a visual experience of a red rose. Similarly, I come to know that I am consciously believing that p due to the phenomenal character that the belief that p has (Pitt 2004, Horgan 2011, Kriegel 2011, Kriegel 2013).

3. Arguments for Cognitive Phenomenology

We can distinguish between different types of arguments for cognitive phenomenology. These arguments are generally arguments for Irreducibility, but some of them also defend stronger claims about the nature of cognitive phenomenology. This section presents the types of arguments that are most commonly used and common responses to them.

a. Arguments from Examples

Arguments from examples appeal to cases or circumstances where one seems to be in phenomenal states that involve cognitive phenomenology. For instance, there is something that it is like for me to suddenly remember that I have an appointment with a student in 5 minutes. The state that I am in when I suddenly remember something is a cognitive state. There can be sensory states involved as well; a visual image of my student may pop up, or I may feel annoyed because I almost forgot about the appointment. According to the argument, the cognitive state I am in when I suddenly remember my appointment puts me in a phenomenal state, and no wholly sensory states suffice to put me in that state.

Another argument from example appeals to tip-of-the-tongue experiences, the kind of experiences one has when searching for a word that one knows but fails to retrieve (Goldman 1993). There is something that it is like to have such experiences; cognitive states play a role in putting one in that state, and no wholly sensory states suffice to put one in that state.

A sceptic about cognitive phenomenology may agree with the proponent of cognitive phenomenology that the states that these arguments appeal to are phenomenal states, while denying that they are cognitive phenomenal states. According to the sceptic, there are always some sensory states involved when one suddenly remembers something. When I remember that I have an appointment with my student in 5 minutes, I may visualize my student and feel annoyed with myself for almost forgetting about the appointment. The sensory states that I am in can, according to the sceptic, fully account for the phenomenal character of the state that I am in.

One can make a similar response to the tip-of-the-tongue example. When having a tip-of-the-tongue experience I am making an effort to retrieve a word, and it is the sensory feeling of making an effort that accounts for the phenomenal character of the experience.

A proponent of cognitive phenomenology can insist that if one carefully introspects one’s phenomenal states, it becomes apparent that these states involve cognitive phenomenology. However, such appeals to introspection are problematic because a sceptic may simply claim that she is carefully introspecting the phenomenal state she is in when she suddenly remembers something, but she finds only sensory phenomenology. Nevertheless, it seems wrong to completely dismiss appeals to introspection, as some such appeals appear more convincing than others. Charles Siewert (1999) argues that the sensory states involved in cases where one suddenly remembers something occur after the state of suddenly remembering. The state that one is in when suddenly remembering something need not involve any sensory phenomenology at all. On Siewert’s view, the state of suddenly remembering is a pure cognitive phenomenal state (Siewert 1999).

b. Contrast Arguments

Contrast arguments are among the most commonly used types of argument for cognitive phenomenology. They appeal to two contrasting phenomenal states, s1 and s2, where there appears to be a difference in the phenomenal character of s1 and s2, and where this difference is best explained as a difference in cognitive phenomenology. Contrast arguments can be used when arguing for attitudinal cognitive phenomenology, content cognitive phenomenology, and pure and impure cognitive phenomenology. The expert/novice argument introduced earlier in this article can be seen as a contrast argument.

When contrast arguments are used as arguments for attitudinal cognitive phenomenology, one typically appeals to cases where there is a slight change in one’s attitude towards a content. An example is the change of attitude one experiences when one suddenly grasps a mathematical proof. There is something that it is like to grasp a mathematical proof, and the state one is in when one suddenly grasps it differs from the state one was in before grasping it.

When contrast arguments are used as arguments for content phenomenology one typically appeals to a pair of situations where one is attending to the meaning of an ambiguous utterance in natural language, and where there appears to be a phenomenal difference in the states one is in depending on which proposition one takes the utterance to express (Horgan & Tienson 2002).

Contrast arguments can be more or less convincing, depending on how easy it is to give an alternative explanation of the contrast, and on whether the claim that there is a contrast is convincing.

‘The foreign language argument’, due to Galen Strawson (1994), is perhaps the most famous contrast argument for cognitive phenomenology: Jack is a native English speaker who does not understand French, while Jacques is a native French speaker. Both Jack and Jacques hear the same instance of the utterance ‘La vie est belle’. There is something that it is like for both Jack and Jacques to hear the utterance, though what it is like for Jacques differs from what it is like for Jack. So, Jack and Jacques are put in different phenomenal states. The difference in the phenomenal character of their states can be explained by the fact that Jacques, unlike Jack, understands what is being said. Jacques, unlike Jack, has an attitude of understanding towards the content, and he is able to consciously entertain the content that is being expressed. In the case of Jacques, unlike Jack, cognitive states of understanding and entertaining a content put him in a phenomenal state, and this explains why the phenomenal state he is in differs from the phenomenal state Jack is in. In order to make the foreign language argument into an argument for cognitive phenomenology, one needs to add that the phenomenal difference between Jack’s and Jacques’s states is a difference in cognitive phenomenology.

However, in this case, at least some of the differences between the two phenomenal states involve differences in sensory phenomenology. From phonetic studies, we know that a sentence expressed in a language sounds different for a person who understands that language, compared to what it sounds like for a person who does not understand the language (Pinker 1995). This difference is at least partly auditory. The person who understands the language attends differently to the phonemes and prosody of the utterance compared with the person who does not understand the language. A sceptic about cognitive phenomenology may therefore agree that there is a phenomenal difference between the states that Jack and Jacques are in, but claim that the difference is a difference in purely sensory phenomenology (Lormand 1996). The proponent of cognitive phenomenology may insist that though the phenomenal states of Jack and Jacques also differ in sensory phenomenology, the differences in sensory phenomenology do not sufficiently explain the whole phenomenal difference.

A different type of contrast argument that appeals to ambiguous utterances in a familiar language has been proposed by, among others, Kriegel (2011), Horgan (2011), Horgan & Tienson (2002) and Siewert (1999). For example, there is something that it is like to hear the ambiguous utterance ‘I am going to the bank’ and understand it as being about the financial institution, and this differs from what it is like to hear the same instance of the utterance and understand it as being about the river bank. One is in different phenomenal states depending on which proposition one consciously entertains. Arguably, given that one accepts that there is a phenomenal difference between these states, this difference is best explained as a difference in cognitive phenomenology.

In this case, the argument appeals to the same instance of an utterance in a language that one does understand. A sceptic who agrees that there is a phenomenal difference between the two states may claim that the different understandings cause one to be in different sensory states, and that the phenomenal difference is due to this. However, it is less easy, compared with the foreign language argument, to see what candidates for such states would be. Surely, hearing the utterance and understanding it as ‘I am going to the financial institution’ may cause some emotional responses in someone who has financial problems, but it need not have such an effect. One need not respond emotionally to either of the two understandings of the utterance. Also, one may, but need not, visualize the financial institution or the river bank when hearing the utterance. Arguably, one’s sensory states can remain the same, regardless of which of the two understandings one consciously entertains, and still there is a phenomenal difference. Therefore, if there is a phenomenal difference in this contrast case, the most plausible candidate for explaining the difference is that there is a difference in cognitive phenomenology. One is put in different phenomenal states, and no wholly sensory states suffice to put one in these phenomenal states. Contrast arguments involving ambiguous utterances of this type have the virtue that if there is a phenomenal contrast in these cases, this contrast is difficult to explain away as a contrast in sensory phenomenology. One way of responding to such contrast arguments is to deny that there is a phenomenal contrast, that is, to deny that one is in different phenomenal states in such cases.

c. The Self-Knowledge Argument

The self-knowledge argument, originally presented by David Pitt (2004), is very complex, and this article presents only a rough version of it.

The argument from self-knowledge differs from the types of arguments introduced above in that it explicitly supports a strong cognitive phenomenology claim: the claim that there is a proprietary, distinctive and individuative cognitive phenomenology. According to the argument, we can have immediate knowledge of the content of our own conscious thoughts, and the only way we can explain how such knowledge is possible is by assuming that there is a proprietary, distinctive and individuative cognitive phenomenology of thought. From this it follows that one is able to consciously do three distinct things: a) to distinguish one’s occurrent conscious thoughts from one’s other occurrent conscious mental states (cognitive phenomenology is proprietary); b) to distinguish one’s occurrent conscious thoughts from each other (cognitive phenomenology is distinctive); c) to identify each of one’s occurrent conscious thoughts as the thought it is (cognitive phenomenology is individuative).

According to the self-knowledge argument (Pitt 2004):

P1: It is possible immediately to identify one’s occurrent conscious thoughts: one can know by acquaintance (via introspection) which thought a particular occurrent thought is; but

P2: It would not be possible immediately to identify one’s conscious thought unless each type of conscious thought had a proprietary, distinctive, individuative phenomenology; so

C: Each type of conscious thought—each state of consciously thinking that p, for all thinkable contents p—has a proprietary, distinctive, individuative phenomenology.

The argument is valid. Before questioning the premises, we should say something about what it is that motivates them.

Intuitively, one does know the content of one’s conscious thoughts, and one has a privileged introspective access to one’s own thoughts that other people lack. I know when I am thinking ‘that pizza is good’, and I know that the mental state I am in is a thought and not a perceptual state. So, I am able to identify my thought as a thought, and I am able to identify the content of my thought and distinguish it from other thoughts.

However, according to the premises of the argument, it is possible to ‘immediately’ identify one’s occurrent conscious thoughts (P1). This premise relies on a particular and controversial view of the introspection of phenomenal states, the acquaintance theory. On this view, introspection makes one directly or immediately aware of one’s phenomenal states and their contents. No inferences are made and no causal processes are involved. If one holds a different view of introspection, one can simply deny P1 and so reject the self-knowledge argument. In his article, Pitt strongly defends the acquaintance theory of introspection. For further reading, consult Pitt 2004 and Pitt 2011.

d. An Argument for Pure Cognitive Phenomenology

Contrast arguments and arguments from examples are generally neutral when it comes to whether they are arguments for pure or impure cognitive phenomenology.

However, Kriegel’s cognitive zombie argument is an argument for pure cognitive phenomenology (see Kriegel 2015b and Chudnoff 2015b). A philosophical zombie is a being that acts and talks like a phenomenally conscious being, but that completely lacks phenomenal states. In other words, there is nothing that it is like to be a zombie (see Chalmers 1996).

Imagine a partial zombie, Zoe, who is an expert mathematician. Zoe is also a sensory zombie, in that there is nothing that it is like for her to have sensory experiences. Still, there is something that it is like for her to gain new mathematical insights. Since Zoe is a sensory zombie the phenomenal states she is in when gaining new mathematical insights are purely cognitive phenomenal states.

A sceptic may respond to this thought experiment by claiming that since Zoe is a sensory zombie, there is nothing that it is like for her to gain these insights. One may insist that cognitive states do not suffice to put one in the phenomenal states that one is normally put in when one grasps something or gains a new insight. The sceptic can either claim that the phenomenology involved in being in such phenomenal states is purely sensory, or she could hold that it is impurely cognitive phenomenal.

In order to strengthen the appeal of this thought experiment, one can turn it into a contrast argument. Imagine that Zoe turns into a full zombie. As a full zombie, there is nothing that it is like for her to gain mathematical insights. Intuitively, there is a phenomenal contrast between the states of sensory zombie Zoe, and the states of full zombie Zoe. While there is something that it is like for the sensory zombie Zoe to gain mathematical insights, there is nothing that it is like for the full zombie Zoe to do so. If we share the intuition that there is such a contrast between the two zombies, we should also accept that pure cognitive phenomenology is possible.

Interestingly, the cognitive zombie argument appears more challenging for proponents of impure cognitive phenomenology who deny that there is pure cognitive phenomenology than for a sceptic who denies that there is cognitive phenomenology. Sensory states within different sensory modalities can put one in certain phenomenal states. We can imagine a zombie who lacks sensory phenomenology in all sensory modalities apart from audition. Intuitively, since she has auditory phenomenal states, there is something that it is like for her to watch a movie, though her experience is clearly not as rich as that of an ordinary person. Similarly, even if it is normally the case that the phenomenal state one is in when grasping a mathematical proof is a phenomenal state that both sensory and cognitive states put one in, there is still something that it is like for Zoe the sensory zombie to grasp mathematical proofs, though Zoe’s phenomenal states may not be as rich as those of a normal person. (For further reading, consult Kriegel 2015b and Chudnoff 2015b.)

e. Individual Differences

Philosophers of mind generally agree that conscious sensory states have phenomenal characters. We come to know what it is like to be in a certain conscious sensory state simply by being in that state. But, when it comes to irreducible cognitive phenomenology, philosophers strongly disagree about whether it exists or not. Why do they disagree?

Maybe the reason why philosophers disagree so strongly is that people simply differ? That is, some people have cognitive phenomenal states, while others do not (see Schwitzgebel 2008)? If this is the case, it can explain why highly competent philosophers on both sides of the debate come to different conclusions when introspecting their own conscious states. However, most philosophers seem to dismiss this possibility. What are the reasons for thinking that people differ so greatly in their phenomenal states? Why are there no similar controversies when it comes to disputes about sensory phenomenology?

4. Implications of the Cognitive Phenomenology Debate

What are the implications of the cognitive phenomenology debate? Why should we care about cognitive phenomenology?

One issue that arises from the cognitive phenomenology debate concerns the trustworthiness of introspection. If there is a cognitive phenomenology, then the opponents have overlooked a range of phenomenal states that they enjoy. On the other hand, if there is no cognitive phenomenology, the proponents have been positing a range of phenomenal states that they do not enjoy (Bayne & Montague 2011). Such considerations may lead us to question the reliability of introspection (Schwitzgebel 2008).

The cognitive phenomenology debate also has implications for the general debate about consciousness, since there are certain theories of consciousness that are at odds with the existence of cognitive phenomenology. Examples are accounts that identify phenomenal states with intentional states that have non-conceptual contents (see Tye 1995). Such views are not compatible with thoughts having a distinctive phenomenal character, since the content of a thought is conceptual.

Further, the cognitive phenomenology debate has implications for our view on the relationship between phenomenology and intentionality. Proponents of phenomenal intentionalism take phenomenology to be the source of intentionality (Kriegel 2013, Mendelovici 2018). Most proponents of phenomenal intentionalism hold that there is a cognitive phenomenology. If phenomenology is the source of intentionality, cognitive phenomenology is the source of the intentionality of cognitive states. If there is no cognitive phenomenology, the proponents of phenomenal intentionalism need to tell a different story of how phenomenology can be the source of the intentionality of cognitive states.

The cognitive phenomenology debate also has implications for the debate about whether consciousness can be naturalized. If only sensory states are phenomenal states, naturalizing cognition is part of what Chalmers (1996) labels ‘the easy problem of consciousness’, while naturalizing conscious sensory states is part of ‘the hard problem of consciousness’. The easy problems of consciousness are those that can be solved (in the future) by using the standard methods of cognitive science, whereas the hard problem is that of explaining phenomenal consciousness (see “The Hard Problem of Consciousness”). If there is a cognitive phenomenology, the hard problem of consciousness becomes more expansive, as it will include both sensory and cognitive phenomenal states. Arguably, therefore, if there is a cognitive phenomenology, naturalizing consciousness becomes harder. However, the hard problem remains ‘hard’ whether we accept that there is a cognitive phenomenology or not. If arguments convince us that there is a cognitive phenomenology, we should accept them regardless of the fact that doing so expands the hard problem.

5. References and Further Reading

  • Bayne, T & Chalmers, D. J. 2003. “What is the Unity of Consciousness?” In Cleeremans, A (ed.) The Unity of Consciousness. Oxford University Press.
  • Bayne, T. 2009. “Perception and the Reach of Phenomenal Content.” Philosophical Quarterly 59 (235): 385-404.
  • Bayne, T and Montague, M. 2011. “Cognitive Phenomenology: An Introduction”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
  • Carruthers, P and Veillet, B. 2011. “The Case against Cognitive Phenomenology”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
  • Chalmers, D. 1996. The Conscious Mind. Oxford University Press.
  • Chudnoff, E. 2015a. Cognitive Phenomenology. Routledge.
  • Chudnoff, E. 2015b. “Phenomenal Contrast Arguments for Cognitive Phenomenology.” Philosophy and Phenomenological Research 90 (2): 82-104.
  • Dretske, F. 1995. Naturalizing the Mind. MIT Press.
  • Goldman, A. 1993. “Consciousness, Folk Psychology, and Cognitive Science.” Consciousness and Cognition 2 (4):364-382.
  • Horgan, T. 2011. “From agentive phenomenology to Cognitive Phenomenology: A guide for the perplexed”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
  • Horgan, T and Graham, G. 2012. “Phenomenal Intentionality and Content determinacy”. In Richard Schantz (ed.) Prospects of Meaning. De Gruyter.
  • Horgan, T and Tienson, J L. 2002. “The Intentionality of Phenomenology and the Phenomenology of Intentionality”. In Chalmers, D (ed.) Philosophy of Mind: Classical and Contemporary readings. Oxford University Press.
  • Kriegel, U. 2011. The Sources of Intentionality. Oxford University Press.
  • Kriegel, U. 2013. “The Phenomenal Intentionality Research Program”. In Kriegel, U (ed.) Phenomenal Intentionality. Oxford University Press.
  • Kriegel, U. 2015a. “The Character of Cognitive Phenomenology”. In Breyer, T and Gutland, C (eds.) Phenomenology of Thinking. Routledge.
  • Kriegel, U. 2015b. The Varieties of Consciousness. Oxford University Press.
  • Levine, J. 2011. “On the Phenomenology of Thoughts”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
  • Lormand, E. 1996. “Nonphenomenal Consciousness” Nous 30(2): 242-261.
  • Mendelovici, A. 2018. The Phenomenal Basis of Intentionality. Oxford University Press.
  • Montague, M. 2017. “Perception and Cognitive Phenomenology” Philosophical Studies 174: 2045-2062.
  • Nes, A. 2011. “Thematic Unity in the Phenomenology of Thinking” Philosophical Quarterly 62: 84-105.
  • Pitt, D. 2004. “The Phenomenology of Cognition, or What it is Like to Think That P?” Philosophy and Phenomenological Research 69(1): 1-36.
  • Pitt, D. 2011. “Introspection, Phenomenality, and the Availability of Intentional Content”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
  • Prinz, J. 2011. “The Sensory Basis of Cognitive Phenomenology”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
  • Schwitzgebel, E. 2008. “The unreliability of naïve introspection” The Philosophical Review 117 (2): 245-273.
  • Siegel, S. 2010. The Contents of Visual Experience. Oxford University Press.
  • Siewert, C. 1998. The Significance of Consciousness. Princeton University Press.
  • Siewert, C. 2011. “Phenomenal Thought”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
  • Smithies, D. 2013a. “The Significance of Cognitive Phenomenology” Philosophy Compass 8(8): 731-743.
  • Smithies, D. 2013b. “The Nature of Cognitive Phenomenology” Philosophy Compass 8(8): 744-754.
  • Spener, M. 2011. “Disagreement about Cognitive Phenomenology.” In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
  • Strawson, G. 1994. Mental Reality. MIT Press.
  • Strawson, G. 2011. “Cognitive Phenomenology: Real life” In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
  • Tye, M. 1995. Ten Problems of Consciousness: A Representational Theory of the Phenomenal Mind. MIT Press.
  • Tye, M and Briggs, W. 2011. “Is there a Phenomenology of Thought?” In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.

Author Information

Mette Kristine Hansen
Email: Mette.Hansen@uib.no
University of Bergen
Norway

Sigmund Freud: Religion

This article explores attempts by Sigmund Freud (1856-1939) to provide a naturalistic account of religion enhanced by insights and theoretical constructs derived from the discipline of psychoanalysis which he had pioneered. Freud was an Austrian neurologist and psychologist who is widely regarded as the father of psychoanalysis, which is both a psychological theory and therapeutic system. As a theory, psychoanalysis conceptualizes the mind as a system composed of three constituent elements: id, ego, and superego. It focuses on the interaction between those elements, and includes such key concepts as infantile sexuality, repression, latency and transference. Psychoanalytic therapy is an application of this conceptual schema, in which the interaction of the mind’s conscious and unconscious elements in individual cases is explored using the techniques of dream interpretation, free association and the analysis of resistance to identify repressed conflicts and bring them into the conscious mind.

Freud’s thought on religion is, perhaps fittingly, rather complex and ambivalent: while there can be little doubt as to its roundly skeptical, and at times hostile, character, it is nonetheless clear that he had a firm grounding in Jewish religious thought and that the religious impulse held a life-long fascination for him. This article charts the evolution of his views on religion from Totem and Taboo (1913), through The Future of an Illusion (1927) and Civilization and its Discontents (1930) to Moses and Monotheism (1939), focusing in particular on the parallels drawn by him between religious belief and neurosis, and on his account of the role which the father complex plays in the genesis of religious belief. The article concludes with a review of some of the main critical responses which the Freudian account has elicited.

Table of Contents

  1. Psychoanalysis and Religion
  2. Freud’s Jewish Heritage
  3. Philosophical Connections
  4. The Orientation of Freud’s Approach to Religion
  5. Totemism and the Father Complex
  6. Religion and Civilization
  7. The Moses Narrative: The Origins of Judaic Monotheism
  8. Critical Responses
    1. The Anthropological Critique
    2. Myth or Science?
    3. Lamarckian vs. Darwinian Evolutionary Principles
    4. The Primordial Religion: Polytheism or Monotheism?
    5. Religion as a Social Phenomenon
    6. The Projection Theory of Religion
    7. Moses and Monotheism: Interpretive Approaches
  9. References and Further Reading
    1. References
    2. Further Reading

1. Psychoanalysis and Religion

At the heart of Freud’s psychoanalysis is his theory of infantile sexuality, which represents individual psychological human development as a progression through a number of stages in which the libidinal drives are directed towards particular pleasure-release loci, from the oral to the anal to the phallic and, after a latency period, in maturity to the genital. He thus saw the psychosexual development of every individual as consisting essentially of a movement through a series of conflicts which are resolved by the internalization, through the operation of the superego, of control mechanisms derived originally from an authoritative, usually parental, source. In infancy, such a progression entails a process whereby parental control involves the introduction to the child of behavioral prohibitions and limitations and necessitates the repression, displacement or sublimation of the libidinal drives.

Central to this account is the idea that neuroses, which may include the formation of psychosomatic symptoms in the individual, arise essentially either out of external trauma or through a failure to effect a resolution of the internal conflict between libidinal urges and the key psychological control mechanisms. Symptomatically, these often present as compulsive and debilitating patterns of behavior (as in hysteria, repetitive ceremonial movements or an obsession with personal hygiene) which make a normal healthy life impossible, requiring psychotherapeutic intervention in the form of such techniques as dream analysis and free association. Of particular importance, he held, is the resolution of the Oedipus complex, which arises at the phallic stage, in which the male child forms a sexual attachment with the mother and comes to view the father as a hated and feared sexual rival. That resolution, which Freud saw as essential to the formation of sexuality, entails the repression of the drive away from the mother as libidinal object and the male child’s identification with the father. The cluster of associations relating to the multifaceted relationship between son and father Freud termed “the father complex” (1957, 144) and, as we shall see, he viewed it as central to a correct understanding both of the developmental psychology of human beings and of many of the central and most important social phenomena in human life, including religious belief and practice.

In his account of religion Freud deployed what Paul Ricoeur (1913—2005) terms a hermeneutic “of suspicion” (Ricoeur 1970, 32), a reductive and demystifying style of interpretation that repudiated what he saw as a masquerade of conventional meanings operating at the level of common discourse in favor of deeper, less conventional truths relating to human psychology. He sought to demonstrate by this means the true origins and significance of religion in human life, in effect utilizing the techniques of psychotherapy to achieve that goal. Freud’s general position on religion stands firmly in the naturalistic tradition of projectionism stretching from Xenophanes (c.570—c.475 B.C.E.) and Lucretius (c.99—c.55 B.C.E.) through Thomas Hobbes (1588—1679) and David Hume (1711—76) to Ludwig Feuerbach (1804—1872) in holding that the concept of God is essentially the product of an unconscious anthropomorphic construct, which Freud saw as a function of the underlying father complex operating in social groups. “The psycho-analysis of individual human beings,” he thus stated boldly in Totem and Taboo, “teaches us with quite special insistence that the god of each of them is formed in the likeness of his father, that his personal relation to God depends on his relation to his father in the flesh and oscillates and changes along with that relation, and that at bottom God is nothing other than an exalted father” (Freud 2001, 171).

The following sections examine the considerations which led him to this view, the manner in which it found articulation in his writings on religion, and the main criticisms which it has encountered.

2. Freud’s Jewish Heritage

Freud was born to Jewish parents in the town of Freiberg, then in the Austro-Hungarian Empire. His father Jacob was a businessman descended from a long line of rabbinical scholars; a textile merchant, he went bankrupt when Sigmund was four years of age and the family were forced to move to Vienna, where they lived in genteel poverty for many years, dependent in part upon the generosity of relatives. The young Sigmund found it difficult to come to terms with the new urban surroundings and the family’s reduced financial circumstances. Experience of the latter left him with a life-long fear of poverty; his overweening ambition to establish psychoanalysis as a new science and successful treatment for hysteria was, as a result, partially motivated by the desire to achieve financial security for his family.

In the preface to the Hebrew edition of Totem and Taboo, published in 1930, Freud described himself as being “in his essential nature a Jew and who has no desire to alter that nature,” but one who is “completely estranged from the religion of his fathers—as well as from every other religion” (Freud 2001 Preface, xiii). This phrasing marks Freud’s recognition that, notwithstanding his skepticism regarding religion, his character had largely been formed by a Judaic cultural heritage passed on to him by his father Jacob, with whom he had a rather fraught relationship. Freud’s ancestors were affiliates of Hasidic Judaism going back many generations, and included several rabbis and distinguished scholars among their number (Berke 2015, xii). While Jacob was liberal and progressive in his outlook, he retained a deep reverence for the Talmud and the Torah and had overseen Sigmund’s childhood study of the Philippson family Bible, which generated in the young Sigmund a life-long fascination with the story of Moses and his connection with Egypt. He also ensured that the boy had a traditional Jewish schooling in which he was steeped in Biblical studies in the original Hebrew. In that connection the young Freud developed a deep admiration for, and friendship with, one of his religion teachers, Rabbi Samuel Hammerschlag, who was a strong proponent of humanistic Reform Judaism. Such was his admiration for his teacher that Freud ultimately named his fifth and sixth children, Sophie and Anna, after Hammerschlag’s niece and daughter; commentators now generally agree that the patient referred to as ‘Irma’ in Freud’s pivotal The Interpretation of Dreams was in fact Anna Hammerschlag. It was Rabbi Hammerschlag’s deep humanism, more than any other feature of his character, which Freud found inspiring, inculcating in him a lasting commitment to the universality of Enlightenment values. 
It is notable that, in seeking to pay Hammerschlag the highest compliment possible in the obituary which he wrote for him in 1904, Freud compared him to the Hebrew prophets, but also highlighted the extent to which that aspect of his character was integrated with humanistic ideals: “A part from the same fire which animated the great Jewish seers and prophets burned in him … but the passionate side of his nature was happily tempered by the ideal of humanism of our classical German period, which governed him and his method of education” (Freud 1976 IX, 256).

Notwithstanding the positive impact of such religious influences, from adolescence onwards Freud apparently found the observances and strictures required by orthodox Jewish belief increasingly burdensome and he became overtly hostile to the religion of his forefathers and to religion in general (Goodnick 1992, 352); it is likely that this was the principal cause of the estrangement between Sigmund and his father Jacob. That the estrangement ran deep and was a source of distress to Jacob became evident on the occasion of his son’s 35th birthday, when, in a gesture conforming with an established Jewish custom, he presented Sigmund with the family Bible which he had studied so closely as a child, newly rebound in leather. This was accompanied by a richly lyrical dedication in Hebrew, written in the style of melitzah, a literary tradition of Biblical allusion (Alter 1988, 23), referencing the relationship between them and their shared Jewish heritage. In part, the verse ran:

Son who is dear to me, Shelomoh. In the seventh in the days of the years of your life the Spirit of the Lord began to move you and spoke within you: Go, read in my Book that I have written and there will burst open for you the wellsprings of understanding, knowledge, and wisdom… For the day on which your years were filled to five and thirty I have put upon it a cover of new skin and have called it: “Spring up, O well, sing ye unto it!” And I have presented it to you as a memorial and as a reminder of love from your father, who loves you with everlasting love. (trans. and cited by Yerushalmi 1993, 71)

This attempt at effecting a rapprochement, which gently sought to remind Freud of his father’s love for him and of their shared religious and cultural heritage—implying, as one commentator puts it, “that their Bible embodies both the Jewish tradition and this love” (Gresser 1994, 31)—appeared initially not to have been successful. Freud never mentioned his father’s birthday dedication in his writings, though it was found after his death perfectly preserved in the Philippson Bible with which he had been presented, and his reductive critique of institutional religion became instead ever more sustained and pointed. Yet, at the deepest level, an ambivalence remained; as Freud acknowledged in his Autobiographical Study, “My deep engrossment in the Bible story (almost as soon as I had learnt the art of reading) had, as I recognised much later, an enduring effect upon the direction of my interest” (Freud 1959, XX 8).

The death of Jacob on 23rd October 1896 was one of the most important events in Sigmund Freud’s life and precipitated a lengthy period of reflective contemplation on their relationship. As he confessed later that year in a letter to his friend Wilhelm Fliess, “… the old man’s death has affected me deeply. I valued him highly, understood him very well, and with his peculiar mixture of deep wisdom and fantastic light-heartedness he had a significant effect on my life… in my inner self the whole past has been awakened by this event. I now feel quite uprooted” (Freud 1986, 202). The importance of the event cannot be overestimated; Jacob’s death triggered a period of sustained self-analysis in which Freud had what he considered an epiphany: the hostility which he had often felt towards his father, which had at one point made him suspect that Jacob had been guilty of sexually abusing him, was due to the fact that as a child he saw Jacob as a rival for his mother’s love. Thus was born the idea of the Oedipus complex to which we have referred above, which, universalized by Freud, became one of the cornerstones of psychoanalytic theory. In his 1908 preface to the second edition of The Interpretation of Dreams, the work which made his reputation globally and brought him the financial security which he had craved, Freud made clear the extent to which his articulation of the new science was indebted to his analytical resolution of the crisis generated by Jacob’s death: “It was a portion of my own self-analysis, my reaction to my father’s death—that is to say, to the most important event, the most poignant loss of a man’s life” (Freud 2010, xxvi). Still awaiting resolution at that point, however, was the conflict generated in Freud’s life by the demand to find a means of affirming the richness and particularity of his Jewish cultural heritage, as his father had urged in his dedication, without acceding to the Biblical and theological orthodoxies associated with it.
A number of scholars (Rice 1990; Gresser 1994) have suggested that this problem is one of the keys to an understanding of his final work, Moses and Monotheism.

3. Philosophical Connections

Two of the major formative influences upon Freud were those of the philosophers/psychologists Franz Brentano (1838—1917) and Theodor Lipps (1851—1914). Brentano was author of the seminal Psychology From an Empirical Standpoint (1973, orig. 1874); Freud took two philosophy courses under his direction when he first enrolled at the University of Vienna, as part of which he encountered Feuerbach’s writings on religion. Freud was captivated by the scope and clarity of Brentano’s lectures and found the latter’s emphasis on the need for empirical methods in psychology and for philosophy to be informed by logical rigour and scientific findings highly congenial. Less congenial to him, perhaps, were Brentano’s rational theism and his dismissal of the notion of unconscious mental states; these were two key issues on which Freud was subsequently to diverge sharply from him.

Freud—like other gifted students of Brentano such as Edmund Husserl (1859—1938) and Alexius Meinong (1853—1920)—was enthralled by him as a teacher and scholar, describing him in correspondence as “a darned clever fellow, a genius” (in Boehlich (ed.) 1992, 95). Such was the impact of Brentano’s influence that, at one stage, Freud resolved to take his doctorate in philosophy and zoology, a proposal towards which Brentano was favourably disposed but which faculty regulations at the University prevented from being realised.

In seeking to modernise psychology, Brentano had returned to the Aristotelian definition of the subject, understanding it as “the science which studies the properties and laws of the soul, which we discover within ourselves directly by means of inner perception, and which we infer, by analogy, to exist in others” (Brentano 1973, 5). In that connection, he revitalised the famous principle of intentionality from scholasticism as the defining criterion of mental phenomena and processes: unlike the physical counterparts from which they must be distinguished, mental or psychical phenomena, he argued, are necessarily directed towards intentional objects. Further, since such phenomena are accessible to us directly by means of “inner perception,” their existence and nature come, he argued, guaranteed with an epistemic certainty and transparency that is markedly lacking in relation to our perception of physical phenomena, where, for example, we sometimes misapprehend such subjective characteristics as colour and taste as objective properties of things.

Given this distinction between the physical and the mental, Brentano considered that one of the key problems for an empirical psychology was that of constructing an adequate picture of the internal dynamics of the mind from an analysis of the complex interplay between diverse mental phenomena, on the one hand, and the interactions between the mind and the external world, on the other. This conception was to have a profound influence upon the development of Freudian psychoanalysis, into which it was to become prominently incorporated. However, Brentano set his face implacably against admitting the notion of unconscious mental states and processes into a fully scientific psychology. In this he was in part motivated by his conviction that all mental states are known directly in introspection or “inner perception” and are thus, by definition, conscious; mental acts, he considered, are pellucid in the sense that they take themselves as secondary objects and so are consciously apprehended as they occur. Further, the positing of the existence of unconscious mental states also seemed to him to introduce uncertainty and vagueness into the field of psychology and to carry with it an implication of the impossibility of the very rigorous, empirically-based science of mind which he sought to establish.

While Freud adopted Brentano’s characterisation of the intentional nature of mental phenomena throughout his work, he did not, of course, accept that all such phenomena are conscious, and indeed extended the very notion of intentionality, in the guise of symbolic meaning, to the level of the unconscious. For the primary focus of Freud’s interest was medical and his therapeutic practice was, from the outset, predicated upon the assumption of a level of scientific understanding of aberrant behaviour and abnormal mental states. And it seemed evident to him from an early stage that the restriction of psychology to the level of conscious processes and events had made, and would continue to make, such a goal unattainable, and that it was precisely because traditional psychology had operated with that restriction that it found such occurrences problematic and inexplicable. Thus, while both Brentano and Freud were motivated by the desire to create a fully rigorous science of mind, they reached diametrically opposed positions on the question of the inclusion of the unconscious in its terms of reference. In contrast with Brentano’s belief that the very notion of the unconscious lacks intellectual validity, Freud was convinced that a scientific approach to the area of the mental requires the concept of the unconscious as a critical presupposition.

Freud found strong support for this conviction in Theodor Lipps, a thinker who was as committed as Brentano to the ideal of an empirically grounded psychology governed by an experimental methodology, but who, unlike Brentano, considered that this necessitated, at a fundamental level, reference to the unconscious. Lipps’ account of the nature of the unconscious was of particular importance to the development of Freud’s thought for two reasons: In the first instance, when Freud encountered Lipps’ view that consciousness is an “organ” which mediates the inner reality of unconscious mental processes, he found in it a theory which was almost identical to one at which he had independently arrived. Secondly, in his account of humor—which also anticipated much of Freud’s later work on that subject—Lipps had extended the notion of aesthetic empathy (Einfühlung; “in-feeling” or “feeling-into”) from Robert Vischer (1847—1933) into the psychological realm to designate the process that allows us to comprehend and respond to the mental lives of others by putting ourselves in their place, which involved the key notion that meaningful interaction between humans necessitates the projection of mental states and occurrences from the self to others.

Freud adopted and integrated Lipps’ account of projection centrally in his psychoanalytic theory, regarding it as a precondition for establishing the relationship between patient and analyst which alone makes the interpretation of unconscious processes possible. But perhaps of even greater consequence in connection with the analysis of religion is the fact that concomitant with the idea of psychological projection is the notion that the human need to ascribe psychological states to others can and does readily lead to situations in which such ascriptions are extended beyond their legitimate boundaries in the human realm. As David Hume had observed, “There is an universal tendency among mankind to conceive all beings like themselves, and to transfer to every object those qualities with which they are familiarly acquainted, and of which they are intimately conscious” (Hume 1956, Section III). It is in that way that personifications or anthropomorphisms arise: human beings, particularly at the early stage of their development, have an innate tendency to go beyond the legitimate boundaries of application of the psychological concept-range and thus to misapply human-being concepts. A child relates to its environment at large most readily through such a process: in the narratives provided by storybooks, school text-books and film and televisual animation, the child’s interest, attention, and above all, its understanding, are engaged through the attribution of anthropomorphic qualities to non-human objects and organisms: bees worry, trees are sad, ants are curious, and so on.

In his Essence of Christianity (1841; English trans. 1881), Ludwig Feuerbach had offered a sustained critique of religion predicated upon the notion that the very idea of God is such an anthropomorphic construct, with no reality beyond the human mind, and that specific characteristics attributed to God in religion (Love, Benevolence, Power, Knowledge, and so forth) embody an idealized conception of human nature and of the values esteemed by human beings. This projectionist view, which he first encountered under Brentano’s—no doubt, critical—tutelage, was one which Freud came to accept implicitly and indeed to extend, holding that the insights offered by psychoanalysis into the workings of the human mind can explain just why and how religious anthropomorphisms arise. Freud accordingly integrated his account of religion into the broader project of psychoanalysis, suggesting that “a large portion of the mythological conception of the world which reaches far into the most modern religions is nothing but psychology projected into the outer world… We venture to explain in this way the myths of paradise and the fall of man, of God, of good and evil, of immortality and the like—that is, to transform metaphysics into meta-psychology” (Freud 1914, 309. Italics in original).

4. The Orientation of Freud’s Approach to Religion

In articulating this project, Freud drew deeply upon a wide variety of anthropological sources, particularly the work of such contemporary luminaries as John Ferguson McLennan (1827—1881), Edward Burnett Tylor (1832—1917), John Lubbock (1834—1913), Andrew Lang (1844—1912), James George Frazer (1854—1941) and Robert Ranulph Marett (1866—1943) on the connection between social structures and primitive religions. Freud’s claim to originality in this context resides in his attempt to situate projectionism within the framework of psychoanalysis, ultimately interpreting the social origins and cultural significance of the religious impulse in terms paralleling his account of the father-son relationship in individual psychology.

The evolutionist paradigm, which projected a universal linear cultural development from the primitive to the civilized, with the differences found in human societies reflecting stages in that development, came to function as a background assumption in Freud’s thought from an early stage. Tylor, whose Primitive Culture (1871) and Anthropology (1881) are generally regarded as foundational to the then emergent science of cultural anthropology, held that, in terms of human interaction with the world at large, civilization progresses through three developmental “stages,” from magic through religion to science, with contemporary Western culture representative of the final stage. This view was rearticulated by Frazer in his famous Golden Bough and referenced approvingly by Freud (2001, 90), though he emphasized that elements of the first two stages continue to operate in contemporary life. Accordingly, Freud gradually adopted the position of one who seeks to explicate the significance of religion in the context of a cultural milieu in which, having supplanted attempts to control the world through sympathetic magic, it has itself been superseded by science. Furthermore, Freud found in Tylor’s and Frazer’s evolutionist account of cultural progress an implication which had been affirmed explicitly by Feuerbach: “Religion is the childlike condition of humanity” (Feuerbach 1881, 13); it belongs to a social developmental stage paralleling that of the individual, through which each civilization must pass en route to the maturity of scientific understanding. It was perhaps this latter, more than any other factor, which was to suggest to Freud that the psychoanalytical techniques which he pioneered in his account of individual psychology could be applied socially, to explain the nature of the religious impulse in human life generally.

5. Totemism and the Father Complex

Some of Freud’s earliest comments on religion give immediate evidence of the psychologically reductionist direction which his thought was to take, which represented the dynamic underpinning religion as deriving from the powerfully ambivalent relationship between the child and his apparently omnipotent father. For example, in his 1907 paper “Obsessive Actions and Religious Practices” he drew attention to similarities between neurotic behavior and religious rituals, suggesting that the formation of a religion has, as its “pathological counterpart,” obsessional neurosis, such that it might be appropriate to describe neurosis “as an individual religiosity and religion as a universal obsessional neurosis” (Freud 1976 S.E. IX, 125-6), a view which he was to retain for the remainder of his life.

Freud’s first sustained treatment of religion in these terms occurs in his 1913 Totem and Taboo, in the context of his account, heavily influenced in particular by the work of James George Frazer, Andrew Lang and J.J. Atkinson, of the relationship between totemism and the incest prohibition in primitive social groupings. The prominence and strength of the incest taboo was of considerable interest to him as a psychologist, not least because he saw it as one of the keys to an understanding of human culture and as deeply linked to the concepts of infantile sexuality, Oedipal desire, repression and sublimation which play such a key role in psychoanalytic theory. In tribal groups the incest taboo was usually associated with the totem animal with which the group identified and after which it was named. This identification led to a ban on the killing or the consumption of the flesh of the totem animal and to other restrictions on the range of permissible behaviors and, in particular, it led to the practice of exogamy, the prohibition of sexual relations between members of the totem group.

Such prohibitions, Freud believed, are extremely important as they constitute the origins of human morality, and he offered a reconstruction of the genesis of totem religions in human culture in terms which are at once forensically psychoanalytical and rather egregiously speculative. The primal social state of our pre-human ancestors, he argued, closely following J.J. Atkinson’s account in his Primal Law, was that of a patriarchal “horde” in which a single male jealously maintained sexual hegemony over all of the females in the group, prohibiting his sons and other male rivals from engaging in sexual congress with them. In this account, the psycho-sexual dynamic operating within the group led to the violent rebellion of the sons, their murder of the father and their consumption of his flesh (Atkinson 1903, chapters I-III; Freud 2001, 164). However, the sons’ subsequent recognition that no one of them had the power to take the place of the father led them to create a sacred totem with which to identify him and to reinstate the practice of exogamy which the parricide was designed to abolish: the creation of the totem yielded a totem clan within which sexual congress between members was forbidden. The identification of the totem animal with the father arose out of a displacement of the deep sense of guilt generated by the murder, while simultaneously being an attempt at reconciliation and a retrospective renunciation of the crime by creating a taboo around the killing of the totem. “They revoked their deed by forbidding the killing of the totem, the substitute for their father; and they renounced its fruits by resigning their claim to the women who had now been set free” (Freud 2001, 166).
This identification, Freud asserted, confirmed the link between neurosis and religion suggested by him in 1907: given that the totem animal represents the father, then the two main taboo prohibitions of totemism, the ban on killing the totem animal and the incest prohibition, “coincide in their content with … the two primal wishes of children [to kill the father and have sexual intercourse with the mother], the insufficient repression or re-awakening of which forms the nucleus of perhaps all psychoneuroses” (Freud 2001, 153).

The parricidal deed, Freud asserted, is the single “great event with which culture began and which, since it occurred, has not let mankind a moment’s rest” (Freud 2001, 168), the acquired memory traces of which underpin the whole of human culture, including, and in particular, both totem and developed religions. Such a view, of course, presupposes the validity of the essentially Lamarckian idea that traits acquired by individuals, including psychological traits such as memory, can be inherited and thus passed through the generations. This was a controversial notion to which Freud, who never fully accepted the Darwinian account of evolution through natural selection, steadfastly adhered throughout his life, in the face of scientific criticism. He also took it as being consistent with Ernst Haeckel’s (1834—1919) view that ontogeny recapitulates phylogeny, that is, that the stages of individual human development repeat those of the evolution of humanity; this view he took as scientific justification of his belief that psychoanalytical techniques could be applied with equal validity to the social as to the individual.

The counterpart to the primary taboo against killing or eating the totem animal, Freud pointed out, is the annual totem feast, in which that very prohibition is solemnly and ritualistically violated by the tribal community, and he followed the Orientalist William Robertson Smith (1846—1894) in linking such totem feasts with the rituals of sacrifice in developed religions. Such feasts involved the entire community and were, Freud argued, a mechanism for the affirmation of tribal identity through the sharing of the totem’s body, which was simultaneously an affirmation of kinship with the father. Freud saw no contradiction in such a ritual, holding that the ambivalence contained in the father-complex pervades both totemic and developed religions: “Totemic religion not only comprises expressions of remorse and attempts at atonement, it also serves as a remembrance of the triumph over the father” (Freud 2001, 169). The father is thus represented twice in primitive sacrifice, as god and as totem animal, the totem being the first form taken by the father substitute and the god a later one in which the father reassumes his human identity. The dynamic which operates in totem religions, Freud argued, is sustained by and underpins the evolution of religion into its modern forms, where the need for communal sacrifice to expiate an original sin should also be understood in terms of parricide guilt.

6. Religion and Civilization

In time Freud came to consider that the account which he had given in Totem and Taboo did not fully address the issue of the origins of developed religion, the human needs which religion is designed to meet and, consequently, the psychological motivations underpinning religious belief. He turned to these questions in his The Future of an Illusion (1927; reprinted 1961) and Civilization and its Discontents (1930; reprinted 1962). In the two works he represented the structures of civilization, which permit men to live in mutually beneficial communal relationships, as emerging only as a consequence of the imposition of restrictive processes on individual human instinct. In order for civilization to emerge, limiting regulations must be created to frustrate the satisfaction of destructive libidinal drives, examples of which are those directed towards incest, cannibalism and murder. Even the religious injunction to love one’s neighbor as oneself, Freud argued, springs from the need to protect civilization from disintegration. Given that history demonstrates that man is “a savage beast to whom consideration towards his own kind is something alien” (Freud 1962, 59), the fashioning of a value system based upon the requirement to develop loving relationships with one’s fellow man is a social and cultural necessity, without which we would be reduced to living in a state of nature. For Freud, the principal task of civilization is thus to defend us against nature, for without it we would be entirely exposed to natural forces which have almost unlimited power to destroy us.

Extending his account of repression from individual to group psychology, Freud contended that, with the refinement of culture, the external coercive measures inhibiting the instincts become largely internalized. Humans become social and moral beings through the functioning of the superego in effecting a renunciation of the more antisocial drives: “external coercion gradually becomes internalized; for a special mental agency, man’s super-ego, takes it over and includes it among its commandments… Those in whom it has taken place are turned from being opponents of civilization into being its vehicles” (Freud 1961, 11). However, the effect of such renunciations is to create a state of cultural privation “resembling repression” (Freud 1961, 43), which in order to foster social harmony must in turn be dissipated by sublimation, the creation of substitute satisfactions for the drives.

Professional work, Freud argued, is one area in which such substitutions take place, while the aesthetic appreciation of art is another significant one; for art, though it is inaccessible to all but a privileged few, serves to reconcile human beings to the individual sacrifices that have been made for the sake of civilization. However, the effects of art, even on those who appreciate it, are transient, with experience demonstrating that they are insufficiently strong to reconcile us to misery and loss. For that effect, in particular for the achievement of consolation for the suffering and tribulations of life, religious ideas are invoked; these ideas, he held, consequently become of the greatest importance to a culture in terms of the range of substitute satisfactions which they provide.

The role which religion has played in human culture was thus described by Freud in his 1932 lecture “On the Question of a Weltanschauung” as nothing less than grandiose; because it purports to offer information about the origins of the universe and assures human beings of divine protection and of the achievement of ultimate personal happiness, religion “is an immense power, which has the strongest emotions of human beings at its service” (Freud 1990, 199). Since religious ideas thus address the most fundamental problems of existence, they are regarded as the most precious assets civilization has to offer, and the religious worldview, which Freud acknowledged as possessing incomparable consistency and coherence, makes the claim that it alone can answer the question of the meaning of life.

For Freud, then, the cultural and social importance of religion resides both in reconciling men to the limitations which membership of the community places upon them and in mitigating their sense of powerlessness in the face of a recalcitrant and ever-threatening nature. In this respect again, Freud held, group psychology is an extension of individual psychology, with the powerful father figure in patriarchal monotheistic religions providing the required protection against the threat of destruction: “Now that God was a single person, man’s relations to him could recover the intimacy and intensity of the child’s relation to his father” (Freud 1961, 19). It is in this sense, he argued, that the father-son relationship so crucial to psychoanalysis demands the projection of a deity configured as an all-powerful, benevolent father figure.

Genetically, Freud argued, religious ideas thus owe their origin neither to reason nor experience but to an atavistic need to overcome the fear of an ever-threatening nature: “[they] are not precipitates of experience or end results of thinking: they are illusions, fulfilments of the oldest, strongest and most urgent wishes of mankind. The secret of their strength lies in the strength of those wishes” (Freud 1961, 30). In declaring such ideas illusory Freud did not initially seek to suggest or imply that they are thereby necessarily false; an illusory belief he defined simply as one which is motivated in part by wish-fulfillment, which in itself implied nothing about its relation to reality. He gives the example of a middle-class girl who believes that a prince will marry her; such a belief is clearly inspired by a wish-fantasy and is unlikely to prove justified, but such marriages do occasionally happen. Religious beliefs, he suggested in The Future of an Illusion, are illusions in that sense; unlike delusions, they are not, or are not necessarily, “in contradiction with reality” (Freud 1961, 31). However, by the time he wrote Civilization and its Discontents he was prepared to take his religious skepticism a stage further, explicitly declaring religious beliefs to be delusional, not only on an individual but on a mass scale: “A special importance attaches to the case in which [the] attempt to procure a certainty of happiness and a protection against suffering through a delusional remolding of reality is made by a considerable number of people in common. The religions of mankind must be classed among the mass-delusions of this kind” (Freud 1962, 28).

Given that religion has, as Freud acknowledged, made very significant contributions to the development of civilization, and that religious beliefs are not strictly refutable, the question arises as to why he came to consider that religious beliefs are delusional and that a turning away from religion is both desirable and inevitable in advanced social groupings. The answer given in Civilization and its Discontents is that, in the final analysis, religion has failed to deliver on its promise of human happiness and fulfillment; it seeks to impose a belief structure on humans which has no rational evidential base but requires unquestioning acceptance in the face of countervailing empirical evidence: “Its technique consists in depressing the value of life and distorting the picture of the real world in a delusional manner—which presupposes an intimidation of the intelligence” (Freud 1962, 31). He took this as confirming his belief that religion is akin to a universal obsessional neurosis generated by an unresolved father complex and is situated on an evolutionary trajectory which can only lead to its general abandonment in favor of science. “If this view is right,” he concluded, “it is to be supposed that a turning-away from religion is bound to occur with the fatal inevitability of a process of growth, and that we find ourselves at this very juncture in the middle of that phase of development” (Freud 1961, 43). That Freud saw the movement from religious to scientific modes of understanding as a positive cultural development cannot be doubted; indeed, it is one which he saw himself facilitating in a process analogous to the therapeutic resolution of individual neuroses: “Men cannot remain children for ever; they must in the end go out into ‘hostile life’. We may call this education to reality. Need I confess to you that the sole purpose of my book is to point out the necessity for this forward step?” (Freud 1961, 49).

In Civilization and its Discontents Freud mentions that he had sent a copy of The Future of an Illusion to an admired friend, subsequently identified as the French novelist and social critic Romain Rolland. In his response, Rolland indicated broad agreement with Freud’s critique of organised religion, but suggested that Freud had failed in his attempt to identify the true experiential source of religious sentiments: a mystical, numinous feeling of oneness with the universe, “a sensation of ‘eternity’, a feeling as of something limitless, unbounded—as it were, ‘oceanic’” (In Freud 1962, 11). The occurrence of this feeling, Rolland argued, is a subjective fact about the human mind rather than an article of faith; it is common to millions of people and is undoubtedly “the source of the religious energy which is seized upon by the various Churches and religious systems” (In Freud 1962, 11). Thus, he suggested, it would be entirely appropriate to count oneself as religious “on the ground of this oceanic feeling alone, even if one rejects every belief and every illusion” (In Freud 1962, 11). He concluded that, in this important respect, Freud’s account of the origins of religion had missed its mark to a significant degree.

Freud was clearly troubled by Rolland’s challenge, confessing that it caused him no small difficulty. On the one hand his respect for Rolland’s intellectual honesty made him take seriously the possibility that his analysis of religion might be deficient in failing to take cognizance of mystical feelings of the kind described. On the other hand, he was confronted with the obvious problem that feelings are notoriously difficult to deal with in a scientific manner. Additionally—and perhaps more importantly—Freud admitted to being unable to discover the oceanic feeling in himself, though he was not disposed on that ground to deny the occurrence of it in others. Given that such a feeling exists, even on the scale suggested by Rolland, the only question to be faced, Freud declared, is “whether it ought to be regarded as the fons et origo of the whole need for religion” (Freud 1962, 12).

Dismissing the possibility of accounting for the oceanic feeling in terms of an underlying physiology, Freud’s response was to focus on its “ideational content,” that is, the conscious ideas most readily associated with its feeling-tone. In that connection, he offered an account of the oceanic feeling as being a revival of an infantile experience associated with the narcissistic union between mother and child, in which the awareness of an ego or self as differentiated from the mother and world at large has yet to emerge in the child. In that sense, he contended, it would be implausible to take it as the foundational source of religion, since only a feeling which is an expression of a strong need could function as a motivational drive. The oceanic feeling, he conceded, may have become connected with religion later on, but he insisted that it is the experience of infantile helplessness and the longing for the father occasioned by it which is the original source from which religion derives (Freud 1962, 19).

However, while this analysis of the relation between religion and mystical experience is acknowledged as important and influential, few commentators have deemed it entirely adequate, the self-confessed absence of any direct experience of the oceanic feeling in Freud’s own case seeming to many to have led to an underestimation on his part of the significance of such feelings in the genesis of religion.

A very significant body of literature has since grown up around the idea that religion might have emerged genetically, and derive its dynamic energy, as Rolland suggested, from mystical feelings of oneness with the universe in which fear and anxiety are transcended and time and space are eclipsed. The work of thinkers as diverse as Paul Tillich (1886—1965), Ludwig Wittgenstein (1889—1951) and Paul Ricoeur (1913—2005) in this connection has proven influential and has established an ongoing dialogue between psychology and philosophy/theology (compare Parsons 1998, 501). Additionally, Freud’s dismissal of the possibility of a physiological approach to mystical experience has been questioned. Recent scientific investigation of the neurophysiological correlates of mystical or spiritual experiences, utilizing magnetic resonance imaging (MRI) and related technologies, while extremely controversial, appears to demonstrate that some deep meditative practices trigger alterations in brain metabolism, occasioning the kind of numinous feelings specified by Rolland (compare d’Aquili & Newberg 1999, ch. 6; Saarinen 2015, 19).

7. The Moses Narrative: The Origins of Judaic Monotheism

In 1939, while exiled in Britain and suffering from the throat cancer which was to lead to his death, Freud published his final and most controversial work, Moses and Monotheism. Written over a period of many years and sub-divided into discrete segments, two of which were published independently in the periodical Imago in 1937, the book has an inelegant structure. The many repetitions that it contains, coupled with the initial strangeness of the arguments advanced, persuaded some that it was the product of a man whose intellectual powers had fallen into serious decline. The analysis of Judaism offered in the text also evoked a vitriolic response from some quarters and even led to allegations of Jewish self-hatred on Freud’s part. However, in more recent times the book has become recognized as one of the most important in the Freudian canon, offering an innovative contribution to the understanding of the nature of religious truth and of the role played by tradition in religious thought.

The focal point of the work is the figure of Moses and his connection with Egypt, which had exerted a fascination on Freud since his childhood study of the Philippson Bible, as evidenced also in his publication of the essay “The Moses of Michelangelo” in 1914. Accordingly, at this late juncture in his life and with the threat of fascist antisemitism looming over Europe, he turned his attention once more to the religion of his forefathers, constructing an alternative narrative to the orthodox Biblical one on the origins of Judaism and the emergence from it of Christianity. Developing a thesis partly suggested by work of the Protestant theologian Ernst Sellin (1867—1946) in 1922, Freud argued that the historical Moses was not born Jewish but was rather an aristocratic Egyptian who functioned as a senior official or priest to the Pharaoh Amenhotep IV. The latter had introduced revolutionary changes to almost all aspects of Egyptian culture in the 14th century B.C.E., changing his name to Akhenaten, centralizing governmental administration and moving the capital from Thebes to the new city of Akhetaten. More significantly, he had also introduced a strict new universal monotheistic religion to Egypt, the religion of the god Aton or Aten, in the process outlawing as idolatrous the veneration of the traditional Egyptian polytheistic deities, including the then dominant religion of Amun-Ra, removing all references to the possibility of an afterlife and prohibiting the creation of graven images. He had also proscribed all forms of magic and sorcery, closed all the temples and suppressed established religious practice, thereby undermining the social status and political power of the Amun priests. In Freud’s words, “This king undertook to force upon his subjects a new religion, one contrary to their ancient traditions and to all their familiar habits. It was a strict monotheism, the first attempt of its kind in the history of the world as far as we know and religious intolerance, which was foreign to antiquity before this and for long after, was inevitably born with the belief in one God” (Freud 1939, 34-5). This religion was represented as a universal rather than a local one, reflective of the fact that imperial conquest had extended the Pharaoh’s rule beyond the borders of Egypt into Nubia, Syria and parts of Mesopotamia, which brought with it the novel idea of exclusivity: that the God Aton was not merely the supreme god, but the only god.

These radical innovations were not well received either by the disempowered Amun priestly caste or by the Egyptian general populace; predictably, they produced a fanatical desire for retribution and the return of the traditional religious practices on the part of the priests and the discontented people, “a reaction which was able to find a free outlet after the king’s death” (Freud 1939, 39). Thus, when the Pharaoh died in 1358 B.C.E. the religion of Aton was ruthlessly suppressed in Egypt and Akhenaten became known to his successors as the “heretic king” whose memory they sought to expunge from the historical record. In his narrative, Freud depicts a despairing Moses, a devotee of the Aton religion, seeing “his hopes and prospects destroyed” (Freud 1939, 46), responding to these events by placing himself at the head of an enslaved Semitic tribe which had long been in bondage in Egypt and leading them to freedom across the Sinai. In the process he converted them to an even more spiritualized, rigorous and demanding form of monotheism, which involved the Egyptian custom of circumcision, a symbolic act of submission to the Divine Will.

In the Freudian narrative the onerous demands of the new religion ultimately led Moses’ followers to rebel and to kill him, an effective repetition of the original father murder outlined in Totem and Taboo, after which they turned to the cult of the volcano god Yahweh. But the memory of the Egyptian Moses remained a powerful latent force until, several generations later, a second Moses, the son-in-law of the Midianite priest Jethro, shaped the development of Judaism by integrating the monotheism of his predecessor with the worship of Yahweh. By this means the guilt deriving from the murder of the original Moses survived in the collective unconscious of the Jewish people and led to the hope of a messiah who would redeem them for their forefathers’ murderous act.

While Freud evidently retained his view of religion as the analogue of an obsessional neurosis, this account now contained the recognition that, as such, its effects are not necessarily pathological, but, on the contrary, can also be socially and culturally beneficial in a marked way. Thus he points out in his narrative that, through the example and guidance of the great prophets, there arose an ethical tradition within Judaism, ultimately traceable back to Moses the Egyptian, which proscribed iconic representation and ceremonial performance, demanding in their place belief and “a life of truth and justice” (Freud 1939, 82), a tradition with which Freud evidently had deep affinity. In his view, the Judaic ethic was one which demanded restrictions on the gratification of certain instincts as being incompatible with its spiritualised view of human nature and dignity, in a manner paralleling that in which the totem laws had imposed the rule of exogamy within the totem clan. Such restrictions, he argued, enabled Jewish culture to flourish and to take on its unique character. The prophets “did not tire of maintaining that God demands nothing else from his people but a just and virtuous life: that is to say, abstention from the gratification of all impulses that according to our present-day moral standards are to be condemned as vicious” (Freud 1939, 187). In this account, the murder of Moses was thus the initial event which provoked a sense of guilt that in turn shaped the ethical content of Judaic monotheism. This guilt, Freud argued, marked what he termed “the return of the repressed” (Freud 1939, 197), the emergence of compulsive patterns of behavior in the life of a social group generated by a dynamic originating in a traumatic event lying in the distant past but mediated and transmitted to the present in covert form by a tradition inspired, and partly shaped, by unconscious memory-traces. 
“All phenomena of symptom-formation can be fairly described as ‘the return of the repressed’,” he argued; “The distinctive character of them, however, lies in the extensive distortion the returning elements have undergone, compared with their original form” (Freud 1939, 201). This is something, he held, which constitutes an “archaic heritage” that does not need to be reacquired by each generation, but merely to be reawakened, and he charted the development of that heritage by enumerating the stages through which the repressed returns, from the primeval father through to the totem, to the hero, then to the polytheistic gods and finally to the monotheistic concept of a single Highest Being.

On this account, the obsessional sense of guilt governing and shaping the ascetic, highly spiritualized ethic implicit in Judaism has been passed on through the generations, such that it has become the very essence of the Jewish character: “The origin … of this ethics in feelings of guilt, due to the repressed hostility to God, cannot be gainsaid. It bears the characteristic of being never concluded and never able to be concluded with which we are familiar in the reaction-formations of the obsessional neurosis” (Freud 1939, 212). To recognize, through this form of (psycho)analysis, the genesis of the ethical system in the guilt arising from a nefarious historical deed is, he suggested, to free oneself from its obsessive features while simultaneously accepting its entirely human origins. But such a recognition does not entail an abandonment of the core value system, as there is a sense, as Freud acknowledged to be true in his own case, in which that ethical heritage cannot be repudiated once it is acquired.

This narrative account of the rootedness of the Jewish monotheistic tradition in the life and murder of the man Moses captures what Freud believed to be its most essential feature, something “majestic,” an eternal truth, “historic” rather than “material,” that “in primaeval times there was one person who must needs appear gigantic and who, raised to the status of a deity, returned to the memory of men” (Freud 1939, 204). For this reason, a number of commentators, in particular Gresser and Friedman, argue persuasively that the Moses text should be seen as a response to the question posed by many of Freud’s critics after the publication of the Hebrew edition of Totem and Taboo as to the sense in which he remained, as he claimed, “in his essential nature a Jew,” given his psychologically reductive analysis of religion and his perceived hostility to religious orthodoxy. The answer, they suggest, could be offered by him in Moses and Monotheism only in terms of what he saw as essential to Judaism itself, a rigorous, spiritually intellectualized life ethic, centering on the virtues of truth and justice, derived from the man Moses, its human creator, through the work and influence of the prophets (compare Whitebook 2017, 68-9).

In early Christianity, Freud argued, the guilt of Moses’ murder became reconfigured in the Pauline tradition as the notion of an original sin for which atonement must be sought through a sacrificial death, the effect of which was to abolish the feeling of guilt and supplant Judaism with Christianity: “Paul, by developing the Jewish religion further, became its destroyer. His success was certainly mainly due to the fact that through the idea of salvation he laid the ghost of the feeling of guilt” (Freud 1939, 141). Once again, this historical transition was interpreted by Freud in clear Oedipal terms: “Originally a Father religion, Christianity became a Son religion. The fate of having to displace the Father it could not escape” (Freud 1939, 215). However, he held that the advent of Christianity was in some respects a step back from monotheism and a reversion to a covert form of polytheism, with the panoply of saints standing as a surrogate for the lesser gods of pagan antiquity. He accordingly saw the process whereby Christianity supplanted Judaism as comparable to the historical expunging of the monotheistic religion of Aton in ancient Egypt after the death of the Pharaoh Akhenaten: “The triumph of Christianity was a renewed victory of the Amon priests over the God of Ikhnaton” (Freud 1939, 142).

What is arguably of most importance in the Moses narrative is that it constitutes a final effort by Freud to reconcile himself with his own Jewish heritage; as one critic suggests, “Freud uses Moses to re-affirm his loyalty to a people whose religion he does not share but whose claim on him he steadfastly refuses to disavow” (Friedman, 1998, 148). The Jewish people, Freud pointed out, have a self-confidence which springs from the idea of being chosen by God from amongst the peoples of the world, an idea which derives strength from the related notion of participation in the reality of a supreme Deity. But the tenet of the Judaic religion which historically has had perhaps the most significant effect of all, he contended, has been the prohibition, derived from the religion of Aton, of graven images as idolatrous. That forces the believer into worship of a dematerialized God, an abstraction apprehensible only to the intellect, a movement described by Freud as “a triumph of spirituality over the senses” (Freud 1939, 178). This shift from the sensible to the conceptual was, he believed, “unquestionably one of the most important stages on the way to becoming human” (Freud 1939, 180), and it gave a preeminence to abstractions in Jewish intellectual life that made possible some of its key contributions to Western mathematics, science and literature, including, of course, the discipline of psychoanalysis. In that sense, he ultimately recognized that the very science of mind which he had pioneered and with which he sought to expose the Oedipal nature of religion was itself a cultural product of the Judaic religious impulse.

8. Critical Responses

Freud’s utilization of the conceptual apparatus of psychoanalysis in his treatment of religion yields a naturalistic account rooted in psychoanalytic theory which, while being arguably one of the more self-consistent to be found in the modern age, is also one of the most controversial. In its main features it strongly anticipated, and almost certainly influenced, contemporary critiques of religion associated with the “New Atheism” movement of the late 20th and early 21st centuries, such as those of Daniel Dennett, Richard Dawkins, Sam Harris and Christopher Hitchens (1949—2011). The impact of Freud’s psychoanalytical projectionism can also be traced in the development of contemporary radical theology, particularly in the work of Don Cupitt and Lloyd Geering. The responses to it, in turn, occupy a very wide spectrum, from enthusiastic affirmation to condemnatory repudiation. A representative sample of these would include the following.

a. The Anthropological Critique

The idea of the “primal horde” was derived by Atkinson and Freud from what was no more than a cautious suggestion by Darwin in his Descent of Man that, amongst several possibilities regarding the social organization of “primeval” humans, one was that it might have consisted of small patriarchal groups led by a single dominant male, “each with as many wives as he could support and obtain, whom he would have jealously guarded against all other men” (Darwin 1981, II 362). This suggestion, which became one of the linchpins of Freud’s account of totem religion, has not received scientific corroboration, and it remains questionable whether the idea has any basis in reality (compare Smith, R.J. 2016). Further, the progressivist evolutionary paradigm championed by Freud, with its projection of a universal linear cultural development from the primitive to the civilized, is largely rejected by contemporary ethnologists and social anthropologists, in particular those influenced by the work of Franz Boas (1858—1942). The assimilation of prehistoric humans with contemporary “primitive” humans on which it is based, and the narrative constructed out of that assimilation, are generally regarded as Eurocentric in their presuppositions and as deriving from the mindset of 19th century imperialism (Kenny, R. 2015). Thus, in his influential review of Freud’s Totem and Taboo in 1919, the eminent American anthropologist Alfred L. Kroeber, who was a disciple of Boas, subjected Freud’s account of totemism to an extended and trenchant critique, suggesting that the method employed in it amounted to “multiplying into one another, as it were, fractional certainties … without recognition that the multiplicity of factors must successively decrease the probability of their product” (Kroeber 1920, 51). 
Kroeber attributed this almost entirely to the reliance by Freud on the speculative approach taken by such nineteenth century ethnologists as Tylor and Frazer; their anthropological work, he stated bluntly, “is not so much ethnology as an attempt to psychologize with ethnological data” (Kroeber 1920, 55). In a less trenchantly-worded retrospective review written 20 years later, Kroeber—who had in the interim spent some time as a practicing lay psychoanalyst—sought to make conceptual space for a reconciliation of Freud’s theory with scientific ethnology by making a distinction between “historical” and “psychological” thinking, suggesting that Freud’s account should be understood as involving the latter rather than the former (Kroeber 1939, 447). However, notwithstanding that, Kroeber’s strongly negative assessment in his original review of Freud’s incursion into the field of scientific anthropology is now generally accepted within the discipline. Accordingly, Freud’s account of totemism, considered as a direct contribution to an understanding of the development of human culture, would now be viewed with considerable suspicion by professional anthropologists.

b. Myth or Science?

For these reasons, Freud’s projectionist theory of religion as evolving from a primal parricide has been called into serious question as a scientific or historical hypothesis, and with it, the status of psychoanalysis itself. Karl Popper (1902—1994) and Ludwig Wittgenstein have both argued against Freud’s repeated claim for the scientific status of psychoanalysis and—by implication—the account of religion which he developed from it. Popper did so on the grounds that the terms in which psychoanalytic theory is couched make it unfalsifiable in principle and thus unscientific. The theories of Freud and Adler, he argued, describe some facts, but “in the manner of myths. They contain most interesting psychological suggestions, but not in a testable form” (Popper 1963, 37), unlike, for example, the propositions of the natural sciences which almost certainly served as a model for Freud. Wittgenstein, who considered Freud to be one of the few contemporary thinkers with “something to say” (Wittgenstein 1966, 41), albeit one whose whole way of thinking “wants combatting” (ibid., 50), was intrigued by Freud’s focus on mythology in his narratives, and saw that much of the persuasive force of his work derived from the claim that it has constructed a scientific explanation of ancient myths. However, he considered that what Freud had effected was of a different order: “What he has done is propound a new myth” (Wittgenstein 1966, 51).

In a similar vein, Paul Ricoeur, in conceding that the primal parricide depicted by Freud is constructed out of ethnological scraps “on the pattern of the fantasy deciphered by analysis” (Ricoeur 1970, 208), proposed that it, and indeed the entire edifice of Freud’s psychoanalytic theory, should itself be read as being essentially mythical rather than scientific. He thus argued that “one does psychoanalysis a service, not by defending its scientific myth as science, but by interpreting it as myth” (Ricoeur 1970, 20). This latter stratagem, with some variations, has subsequently been adopted by a number of other commentators who seek a mechanism to validate the Freudian cultural narrative in the face of its undeniable ethnological shortcomings (compare, for example, Paul, 1996). It is worth noting that Ricoeur’s conception of the mythic is complex, and occurs within the context of his construction of a religious hermeneutics that engages and intersects with the Freudian psychoanalytic one while seeking to go beyond it, a hermeneutics that regards myths not as fables, “but rather as the symbolic exploration of our relationship to beings and to Being” (Ricoeur 1970, 551). On such a view, the deficiencies presented by the Freudian narrative are read as being hermeneutic rather than scientific, open to further articulation and refinement through a more nuanced and balanced interpretation of the symbolic structure of religious discourse.

However, the hermeneutic construal of the Freudian enterprise is itself open to the charge that it fails utterly to acknowledge the over-arching importance attributed by Freud to his claim that psychoanalysis is to be properly regarded as a rigorous science of the mind, and it has been vigorously critiqued on those and related grounds by Adolf Grünbaum (1923—2018). For Grünbaum, the hermeneutic approach to Freud constitutes a serious distortion of its subject matter and is reflective of an objectionable scientophobia; rather immoderately, he accused it of having “all of the earmarks of an investigative cul-de-sac, a blind alley rather than a citadel for psychoanalytic apologetics” (Grünbaum 1984, 93). By contrast, he insisted on seeing psychoanalysis precisely as a testable theory, but one which is based upon clinical reports from therapeutic practice rather than rigorous experimentally-derived evidence. He pointed out that Freud, whom he considered “a sophisticated scientific methodologist” (ibid., 128), was fully aware of and highly sensitive to the question of the logic of the confirmation and disconfirmation of psychoanalytic interpretations, but contended that his utilization of the notion of consilience in that connection could not meet the demands of full scientific probity. Grünbaum accordingly came to view psychoanalysis as being based upon an inadequate conception of scientific confirmation; the clinical data ostensibly adduced in its favor from therapeutic sessions, which Ernest Jones had described as “the real basis” of psychoanalysis (Jones 1959, 1:3), are, he argued, the products of a shared influence and are irremediably contaminated by suggestion on the part of the analyst. They cannot therefore properly be regarded as providing confirmatory evidence for the theory, while contemporary psychoanalysis has not met the objection that successful therapy operates as a placebo.

c. Lamarckian vs. Darwinian Evolutionary Principles

As we have seen, Freud’s transposition of the father complex from individual infantile development to the social order relied heavily on Haeckel’s thesis that ontogeny recapitulates phylogeny. The latter is now largely rejected by contemporary science, in particular the manner in which Freudians have adopted it to model the social evolution of human beings analogically with the psychological development of children. Further, it seems evident that Freud’s transposition is deeply problematic and leaves psychoanalysis unable to explain the wide variety of culturally determined personality structures which are demonstrated by contemporary empirical research. Freud’s commitment to Lamarckian evolutionary principles has, of course, also received significant critical comment from the scientific community (Slavet 2009, Ch. 2; Yerushalmi 1993, Ch. 2), though it must be noted that his account of acquired memory traces as being partly constitutive of Jewish identity in Moses and Monotheism owes as much to August Weismann’s germ-plasm theory of inheritance as it does to Lamarckism (Slavet 2009, 28).

d. The Primordial Religion: Polytheism or Monotheism?

The entire enterprise of accounting for the origins of religion as an evolutionary trajectory from polytheism to monotheism has been challenged by the work of the ethnologist Father Wilhelm Schmidt (1868—1954), whose multi-volume Der Ursprung der Gottesidee (The Origin of the Idea of God; 1912—1955) is a wide-ranging study of primitive religion. In it Schmidt argued that the “original” tribal religion was almost invariably a form of primitive monotheism, focused on belief in a single benevolent creator god, with polytheistic religions featuring at a later stage of cultural development. Schmidt, who was influenced by Boas and his followers, was accordingly critical of evolutionist accounts of religious development, contending that they frequently lack solid grounding in the historical and anthropological evidence, and was dismissive on those grounds of the totemic theory propagated by Freud. It must be added that Freud was aware of Schmidt’s work and was less than impressed by its quality or its scientific impartiality. He saw Schmidt, whom he held partially responsible for the abolition of the journal Rivista italiana di Psicoanalisi in Italy, as an implacable enemy of psychoanalysis, who was motivated by a desire to undermine Freud’s account of the genesis of religion. Freud feared a possible suppression of psychoanalysis in Vienna in the mid-1930s by the ruling Catholic authorities, with whom Schmidt had considerable influence. That fear, combined with the hope, which unfortunately proved ill-grounded, that those authorities might function as a bulwark against the threat of Nazism, persuaded Freud to defer publication of the full text of Moses and Monotheism until after he had taken up residence in England (see Freud 1939, Prefatory Notes to Part III), a fact which itself had a considerable negative effect on the literary coherence of the work. 
The substantive issue between Freud and Schmidt on the temporal primacy of polytheism or monotheism remains unresolved and is almost certainly irresolvable; as the theologian Hans Küng puts it, the scientific search for the primordial religion should be called off, as “neither the theory of degeneration from a lofty monotheistic beginning nor the evolutionary theory of a lower animistic or preanimistic beginning can be historically substantiated” (Küng 1990, 70).

e. Religion as a Social Phenomenon

It is instructive to compare Freud’s attempts to deal with the social dimension of religion with those of his near contemporary, the sociologist Émile Durkheim (1858—1917), whose study The Elementary Forms of Religious Life (1995; orig. 1912) has been highly influential, though it should not in any way be seen as a response to Freud. In The Elementary Forms Durkheim set himself the task of analyzing religion empirically as a social phenomenon, holding that such a treatment alone can reveal its true nature. For Durkheim, the social dimension of human life is primary; human individuality itself is largely determined by, and is a function of, social interaction and organization. This was a point missed by Freud, who, as we have seen, sought to deal with the social dimension of religion by an extension of psychoanalytical principles from individual to group psychology. What Durkheim termed “social facts” play an important role in his analysis; they are the collective forces external to individuals which compel or influence them to act in particular ways. Such facts exist at the level of society as a whole and arise from social relationships and human associations, and include law, morality, contractual relationships and, perhaps most importantly, religion.

Durkheim defined religion as “a unified system of beliefs and practices relative to sacred things, that is to say, things set apart and forbidden—beliefs and practices which unite in one single moral community called a Church, all those who adhere to them” (Durkheim 1995, 44). He saw the connection between religious beliefs and practices as a necessary one; for him, religious experience is rooted more in the actions associated with rites than it is in reflective thought. Traditional accounts of religion have tended to treat religious beliefs as essentially hypothetical or quasi-scientific in nature (an approach clearly evident in Freud), which almost inevitably raises skeptical doubts about their validity, whereas Durkheim saw that what is important to the believer is the normative dimension of faith. The true function of religion is to deliver salvation by showing us how to live; as such, it originates in, and receives legitimation from, moments of “general effervescence” (Durkheim 1995, 213), in which members of a group gather together to perform religious rituals. This often leads the participants into a state of psychological excitement resembling delirium, in which they come to feel transported into a higher level of existence where they make direct contact with the sacred object. Participation in such rituals has the effect of affirming and strengthening the collective identity of the group and must be renewed periodically in order to consolidate that identity.

Durkheim took pains to ensure that his use of terms like “delirium” in such contexts should not be misunderstood: the “delirium” associated with religious rituals is, he stressed, “well-founded” (Durkheim 1995, 228) in that it is produced by the operation of social factors that are both irreducibly real and crucially important. Given that it is a foundational postulate of sociology that no human institution rests upon an error or a lie, he declared it unscientific to suggest that systems of ideas of such complexity as religions could be delusory or be the product of illusion, as Freud was to do. In that clear functionalist sense, he concluded, all religions are true; “Fundamentally then, there are no religions that are false. All are true after their own fashion: All fulfil given conditions of human existence, though in different ways” (Durkheim 1995, 2).

This vindication of religion in general, however, has as its counterpart a commitment on Durkheim’s part to an account of the nature of sacred objects or gods which was no less egregiously projectionist than Freud’s. If it is impossible for religious belief, considered as a set of representations relating to the sacred, to be erroneous in its own social right, error can and does emerge, he argued, in the interpretation of what those representations mean, even within the framework of a particular culture. At that level, Durkheim conceded, false beliefs are the norm, because all collective representations are delusional and religion is merely a case in point in that regard: “The whole social world seems populated with forces that in reality exist only in our minds” (Durkheim 1995, 228), non-religious examples of which are the meanings attributed by people to flags, to blood and to humans themselves as a class of being. This point regarding the socially-imposed nature of the meanings associated with collective representations can perhaps be most clearly illustrated by reference to now-defunct cultures and religions. For example, while we readily recognize that the Moai, the deeply impressive monolithic statues of Easter Island, unquestionably had a particular political, aesthetic and religious significance for the Rapa Nui people who created them, the meaning of that symbolism largely escapes us—archeological and anthropological reconstruction aside—as we view them from a perspective external to that culture.

Durkheim contended that in a religious context, the sacred object, which is indeed greater than the individual, is nothing more or less than the power of society itself which, in order to be represented symbolically at all, has to be objectified through a process of projection. Gods or sacred objects, then, are “a figurative expression of … society” (Durkheim 1995, 227); they are society refined, idealized and apotheosized. As such, they represent a power beyond all individual humans, but are ultimately existentially interdependent with them: “while it is true that man is a dependent of his gods, this dependence is mutual. The gods also need man; without offerings and sacrifices, they would die” (Durkheim 1995, 36).

Durkheim’s treatment of religion, then, utilizes a methodology which offers a sharp contrast with Freud’s highly-individualistic, psychological approach to the subject, a contrast which highlights some of the sociological shortcomings of the latter. Unlike Freud, Durkheim also sought to provide an account of religion which achieves full scientific probity while simultaneously doing justice to the richness of the actual lived experiences of believers. Notwithstanding that, however, it seems clear that in the final analysis his anti-skeptical stratagem works satisfactorily only on its own, scientific terms; a believer could scarcely derive comfort from a view which legitimates his belief-system qua sociological fact while implying that the personal God of worship which is its intentional object is, in reality, nothing other than society personified.

f. The Projection Theory of Religion

This raises the whole question of the intellectual plausibility of the projection theory of religion. The question is a complex one, a fact which Freud scarcely acknowledges in his works. As we have seen, the theory, which has a number of related but distinct forms, arose in modernity as a response to the anthropomorphic nature of the attributes which the conceptualization of a personal God in many of the great world religions seems to necessitate. Freud, like Feuerbach, took this as entailing strict anthropotheistic consequences: Feuerbach’s argument reduced God to the essence of man, and Freud sought to go beyond him in offering a psychoanalytical explanation, in terms of the father complex, of why it is human beings have a need to hypostasize their own subjective nature. Belief in God, and the complex patterns of behavior and of rituals associated with that belief, he argued, arise essentially out of the deep psychological need for a Cosmic father.

However, it has been pointed out that such a view underestimates the logical gulf that exists between wishes and beliefs; the former may on occasion be a necessary condition for the latter, but are rarely a sufficient one: an athlete may wish to triumph in an event with every fibre of his being, but that will not necessarily generate a belief that he can do so, much less the delusion that he has done so. Thus, even if it is true that there is a universal wish for a Cosmic father, it is implausible to suggest that such a wish is a sufficient condition for religious belief and the complex practices and value systems associated with it (Kai-man Kwan 2006). Further, as Alvin Plantinga (1932—) argues, in the absence of compelling empirical evidence to support the view that such a universal wish exists, Freud was left with no option but to contend that such wishes are equally universally repressed into the unconscious, a move which opens his theory to the accusation of being empirically untestable (Plantinga 2000, 163).

It is to be noted too that concerns about anthropomorphisms in religious language are in no way restricted to religious skeptics: apophatic or negative theology, for example, grew out of recognition of the logical difficulties implicit in attempts to express the nature of the divine in language. As a result, theologians such as Maximus the Confessor (580—662),  Johannes Scotus Eriugena (815—877) and—in Judaism—Maimonides (1138—1204) repudiated the positive attribution of characteristics to God in favour of “referencing” God exclusively in terms of what He is not, through the via negativa. It is also important to note that some proponents of the projection theory, such as Spinoza and possibly Xenophanes, saw the projection theory as invalidating only those forms of religious belief which are anthropotheistic in nature. Thus projectionism, so far from being hostile to all forms of religious belief and practice, is in fact consistent with themes relating to the avoidance of idolatry long central to the Abrahamic religions in particular, as evidenced in the proscription on naming God in Judaism and in aniconism, the prohibition of figurative representations of the Divine in the early Orthodox Church, in Calvinism and also in Islam (Thornton, 2015: 139-140).

It is thus perfectly consistent to accept projectionism as an account of religious concept formation without thereby repudiating religious belief. Indeed, the logical compatibility of projectionism with religious belief has led some contemporary religious thinkers to go so far as to embrace projectionism as a condition of a reflective religious commitment. The view that religious representations are products of the human imagination, it has been argued, can be accepted implicitly by believers, as the “mark of the Christian in the twilight of modernity is … trust in the faithfulness of the God who alone guarantees the conformity of our images to reality and who has given himself to us in forms that may only be grasped by imagination” (Green, 2000, 15). This argument is closely paralleled by a suggestion from Plantinga that wish-fulfillment as a mechanism could have arisen out of a divinely created human constitution. For while it may not, in general, be the function of wish-fulfillment to produce true belief, that in itself does not rule out the possibility, Plantinga contends—at least for those who believe in God—that humans have been so constituted by the creator to have a deeply-felt need and wish to believe in him. On this view, the very existence of the wish for a transcendent Father may be taken as evidence for the truth rather than the falsity of the beliefs which it inspires: “Perhaps God has designed us to know that he is present and loves us by way of creating us with a strong desire for him, a desire that leads to the belief that in fact he is there” (Plantinga 2000, 165).

Whatever level of plausibility may be assigned to these views, it is in any case clear that the projection theory is also reflective of the difficulties which certain forms of religious discourse generate: the characterization of God as possessing attributes such as Love and Wisdom, however qualified such attributions may be, seems invariably to invite the kind of challenge that is found in Feuerbach, Freud and even in Durkheim. In that sense, the projection theory highlights deep theological and philosophical issues relating to the nature and meaning of religious language. One of the more promising approaches to this issue is that suggested by the work of Wittgenstein, who, in his Philosophical Investigations (1974), propounded his language-game theory of meaning, according to which the meaning of any term is determined by its actual use in a living language-system. In that connection, he brought out the complex interplay of linguistic and non-linguistic activities and practices in human life, in a manner analogous to Durkheim’s functionalism. An application of this to religious discourse implies that the latter cannot be understood in isolation from the broad web of cultural practices, beliefs and concerns in which it is embedded and from which it derives its meaning. This suggests that concerns that skeptical conclusions necessarily follow from our use of human-being predicates in speaking about the Divine are misguided; such concerns gain credence only when accompanied by the deeply pervasive, but uncritical, philosophical assumption (clearly evident in Freud) that the attributions of anthropomorphic predicates to God are to be understood exclusively as factual descriptions of a particular kind, an assumption which is at the very least gratuitous.

This point is made cryptically by Wittgenstein in an indirect allusion to the projection theory: “‘God’s Eye Sees Everything’—I want to say of this that it uses a picture…. [in saying this] I meant: what conclusions are you going to draw? etc. Are eyebrows going to be talked of, in connection with the Eye of God?” (Wittgenstein, 1966, 71). In other words, while in factual discourse references to human eyes have an internal relationship to references to human eyebrows, such that the occurrence of one may and frequently does give rise to the other, no such correlation is possible or necessary in religious discourse about God’s Eye (or Mercy, Anger, Love, and so forth). Thus while “God’s Eye Sees Everything” conjures up the image of a stern, judgmental all-seeing parental figure which, at one level, is amenable to the Freudian father-complex analysis, at another, arguably deeper, level it is clear that the web of relations that holds between the anthropomorphic terms used cannot meaningfully be compared with that which holds in factual discourse about earthly fathers; even the most literal-minded do not seek to speak of God’s eyebrows. The occurrence of anthropomorphisms in religious discourse, then, does not in itself necessitate the acceptance of anthropotheistic conclusions.

g. Moses and Monotheism: Interpretive Approaches

Moses and Monotheism is the most controversial of Freud’s works, seeking as it does both to utilize psychoanalytic theory to reinterpret key historical events and to embed psychoanalysis within a historiographical narrative. Not only did it contest the orthodox Biblical narrative of the role of Moses in the history of Judaism, but it did so at a time when the Jews of Europe were threatened with complete annihilation. It is unsurprising, then, that it should have become the subject of very strong criticism, on the grounds both of methodology and content; indeed, because its central account of the Egyptian origins of Judaic monotheism has seemed so egregiously at odds with both tradition and the historical evidence, much of the critical interest has focused on the question of Freud’s motives in propagating it. The Freudian narrative is, of course, problematic in the extreme when considered as a putative exegesis of the Exodus story; as one commentator puts it, “There is hardly any need to state that Moses and Monotheism does not operate at the level of an exegesis of the Old Testament and in no way satisfies the most elementary requirement of a hermeneutics adapted to a text” (Ricoeur 1970, 545). Though Moses is almost certainly an Egyptian name, the evidence that Moses was an Egyptian is not conclusive and it has also been suggested that his life was not in fact contemporaneous with that of Amenhotep IV (Banks 1973, 411). Freud’s willingness, towards the very end of his life, to construct such an apparently speculative narrative on the very origins of Judaism has long puzzled scholars, but it is possible to distinguish three broad exegetical approaches relating to the Moses text in the secondary literature:

  1. For much of his life Freud presented an image of himself to the world as an urbane, cosmopolitan intellectual, committed to the ideals of secular humanism and modern science, and at times that seemed to necessitate downplaying his Jewish background and education. Some scholars, such as Jones (1957) and, more recently Gay (1987), have accordingly represented the Moses text primarily as a critique of Judaism, a comprehensive application of the reductive analysis of religion offered in Freud’s earlier works to the religion of his forefathers. In a similar vein, Jan Assmann (1998) sees Freud as continuing the more general task, initiated by Baruch Spinoza (1632—1677), of combating monotheism and undoing the negative values, such as intolerance, religious hatred and the configuration of alternative religions as idolatrous, generated by the absolute conception of truth which monotheistic religions seem to require.
  2. The second approach, associated in particular with Yerushalmi (1993), Bernstein (1998) and Slavet (2009; 2010), repudiates what it sees as a confusion of meaning with motivation in the secondary literature regarding Freud’s text, stressing that what is of importance is what Freud sought to convey, not what motivated him to do so. While acknowledging the resonances within the text of personal factors operating in Freud’s life at the time of publication, such as his relationship with the memory of his father, the resurgence of antisemitism and the personal and professional threat presented by Nazism from which he so narrowly escaped, this approach rejects any autobiographical interpretation of the text, focusing instead on Freud’s account of the nature of the Jewish religion and the factors which constitute and determine Jewish identity. Thus Bernstein sees in Freud’s Moses text a powerful new account of religion in general and of Judaism in particular, centering on the idea that a religious tradition derives its dynamic from a complex interplay of conscious and unconscious forces. Slavet attributes to Freud a racial theory of memory and sees Moses and Monotheism as “the culmination of a lifetime spent investigating the relationships between memory and its rivals: heredity, history, and fiction” (Slavet 2009, 7) in the context of the question of “Jewishness.” On this view, Freud sought to show that the advancement of intellectualized spirituality (Geistigkeit) has been the most important part of the legacy of Judaic monotheism, but that this owed as much to the working out of collective trauma, the return of the repressed, as it did to the conscious influence of the patriarchs and prophets.
  3. Finally, there is the semi-autobiographical approach, largely taken in this article, which sees the text as primarily concerned with the long-standing problem for Freud of resolving his personal father complex. That, in psychoanalytical terms, amounted to the implementation of an instance of “deferred obedience” by defining in a positive way his relationship with the religion into which he was born, albeit with an emphasis on the human origins of the Judaic ethic (Rice 1990; Gresser 1994; Friedman 1998).

In a thinker as complex as Freud, these approaches can neither be taken as exhaustive nor as entirely mutually exclusive, as significant textual evidence can be invoked for all three. What seems evident, at any rate, is that Freud was seeking, at that critical point in Jewish history, to affirm his cultural and intellectual indebtedness to the ethical basis of the religion of his forefathers while simultaneously seeking to demonstrate that the validity of that ethic is not contingent upon the Biblical and theological accretions traditionally associated with it. On such a reading, the question of the accuracy of the historical detail in the Freudian narrative becomes as peripheral as it is—on a non-literal interpretation—to that of the Biblical one. The import of the book, as Friedman puts it, may reside ultimately in a purpose which can certainly be discerned in it: to preserve Judaism and articulate Freud’s own Jewish identity at a stage in a historical process in which his people come to progress from worship of a transcendent God “to the rational and self-conscious appreciation of themselves as a people of great accomplishment descended from a great but human leader” (Friedman 1998, 139).

9. References and Further Reading

a. References

  • Alter, R. 1988. The Invention of Hebrew Prose: Modern Fiction and the Language of Realism (Samuel and Althea Stroum Lectures in Jewish Studies). University of Washington Press.
  • Assmann, J. 1998. Moses the Egyptian: The Memory of Egypt in Western Monotheism. Cambridge, MA: Harvard University Press.
  • Banks, R. 1973. ‘Religion as Projection: A Re-Appraisal of Freud’s Theory’. Religious Studies, vol. 9 (4), 401-426.
  • Berke, J. 2015. The Hidden Freud: His Hassidic Roots. London: Karnac Books.
  • Bernstein, R.J. 1998. Freud and the Legacy of Moses. Cambridge: Cambridge University Press.
  • Boehlich, W. (ed.) 1992. The Letters of Sigmund Freud to Eduard Silberstein, 1871-1881 (trans. A. Pomerans). Harvard University Press.
  • Brentano, F. 1973 (orig. 1874). Psychology From an Empirical Standpoint (trans. A.C. Rancurello, D.B. Terrell and L.L. McAlister). London: Routledge.
  • d’Aquili, E.G. & Newberg, A.B. 1999. The Mystical Mind: Probing the Biology of Religious Experience. Minneapolis: Fortress Press.
  • Darwin, C. 1981. Descent of Man and Selection in Relation to Sex. Princeton University Press.
  • Durkheim, É. 1995 (orig. 1912). The Elementary Forms of the Religious Life (trans. Karen Fields). New York: Free Press.
  • Feuerbach, L. 1881. The Essence of Christianity, 2nd edition (trans. George Eliot). London: Trübner & Co., Ludgate Hill.
  • Frazer, J. G. 2002 (orig. 1890). The Golden Bough. New York: Dover Publications.
  • Freud, S. 1914 (orig. 1901). The Psychopathology of Everyday Life (trans. A.A. Brill). London: T. Fisher Unwin.
  • Freud, S. 1939. Moses and Monotheism (trans. Katherine Jones). London: The Hogarth Press and Institute of Psycho-Analysis.
  • Freud, S. 1957 (orig. 1910). ‘The Future Prospects of Psychoanalytic Therapy’, in The Standard Edition of the Complete Psychological Works of Sigmund Freud (trans. & ed. J. Strachey). Volume XI (1911-1913). W. W. Norton & Company, 139-151.
  • Freud, S. 1959. ‘An Autobiographical Study’, in The Standard Edition of the Complete Psychological Works of Sigmund Freud (trans. & ed. J. Strachey). Volume XX (1925-1926). London: The Hogarth Press and the Institute of Psychoanalysis, 7-70.
  • Freud, S. 1961 (orig. 1927). The Future of an Illusion (trans. James Strachey). New York: W.W. Norton.
  • Freud, S. 1962 (orig. 1930). Civilization and its Discontents (trans. James Strachey). New York: W.W. Norton.
  • Freud, S. 1976. ‘An Obituary for Professor S. Hammerschlag’, in The Standard Edition of the Complete Psychological Works of Sigmund Freud (trans. & ed. J. Strachey). Volume IX (1906-1908). W. W. Norton & Company, 255-6.
  • Freud, S. 1976 (orig. 1907). ‘Obsessive Actions and Religious Practices’, in The Standard Edition of the Complete Psychological Works of Sigmund Freud (trans. & ed. James Strachey). Volume IX (1906-1908). W. W. Norton & Company, 115-128.
  • Freud, S. 1986. The Complete Letters of Sigmund Freud to Wilhelm Fliess, 1887-1904 (trans. & ed. J. Moussaieff Masson). The Belknap Press of Harvard University Press.
  • Freud, S. 1990 (orig. 1933). New Introductory Lectures on Psycho-analysis (trans. James Strachey). New York: W.W. Norton.
  • Freud, S. 2001 (orig. 1913). Totem and Taboo: Some Points of Agreement between the Mental Lives of Savages and Neurotics (trans. James Strachey). Oxford: Routledge Classics.
  • Freud, S. 2010 (orig. 1900, 1908). The Interpretation of Dreams (trans. James Strachey). New York: Basic Books.
  • Friedman, R. 1998. ‘Freud’s Religion: Oedipus and Moses’. Religious Studies, 34 (2), 135-149.
  • Gay, P. 1987. A Godless Jew? Freud, Atheism and the Making of Psychoanalysis. New Haven: Yale University Press.
  • Goodnick, B. 1992. ‘Jacob Freud’s Dedication to His Son: A Reevaluation’. The Jewish Quarterly Review, Vol. 82 (3-4), 329-360.
  • Green, G. 2000. Theology, Hermeneutics and Imagination: The Crisis of Interpretation at the End of Modernity. Cambridge: Cambridge University Press.
  • Gresser, M. 1994. Dual Allegiance: Freud as a Modern Jew. Albany, NY: State University of New York Press.
  • Grünbaum, A. The Foundations of Psychoanalysis. Berkeley: University of California Press.
  • Hume, D. 1956 (orig. 1757). The Natural History of Religion (ed. H.E. Root). London: A.C. Black.
  • Jones, E. 1957. Sigmund Freud. Life And Work: Volume Three – The Last Phase 1919-1939. London: Hogarth Press.
  • Jones, E. 1959 (ed). Freud: Collected Papers in 5 Volumes (trans. Joan Riviere). New York: Basic Books.
  • Kai-man Kwan. 2006. “Are Religious Beliefs Human Projections?” in Raymond Pelly and Peter Stuart, eds., A Religious Atheist? Critical Essays on the Work of Lloyd Geering. Dunedin, New Zealand: Otago University Press, 41-66.
  • Kenny, R. 2015. ‘Freud, Jung and Boas: the psychoanalytic engagement with anthropology revisited’. Notes and records of the Royal Society of London. Jun 20; 69(2): 173–190. Online: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4424604/
  • Kroeber, A.L. 1920. ‘Totem and Taboo: An Ethnologic Psychoanalysis’, American Anthropologist, New Series, Vol. 22 (1), 48-55.
  • Kroeber, A.L. 1939. ‘Totem and Taboo in Retrospect’. American Journal of Sociology, Vol. 45 (3), 446-451.
  • Lang, A. & Atkinson, J.J. 1903. Social Origins and Primal Law. London: Longmans Green.
  • Parsons, W.B. 1998. “The Oceanic Feeling Revisited.” The Journal of Religion, vol. 78 (4), 501–523.
  • Paul, R. A. 1996. Moses and Civilization: The Meaning Behind Freud’s Myth. New Haven; London: Yale University Press.
  • Plantinga, A. 2000. Warranted Christian Belief. Oxford University Press.
  • Popper, K. 1963. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge.
  • Rice, E. 1990. Freud and Moses: The Long Journey Home. Albany, New York: SUNY Press.
  • Ricoeur, P. 1970. Freud and Philosophy: An Essay on Interpretation (trans. D. Savage). New Haven & London: Yale University Press.
  • Saarinen, J.A. 2015. A Conceptual Analysis of the Oceanic Feeling – With a Special Note on Painterly Aesthetics. Jyväskylä: Jyväskylä University Printing House. Online at: https://jyx.jyu.fi/dspace/bitstream/handle/123456789/45384/978-951-39-6078-0_vaitos07032015.pdf?sequence=1
  • Schmidt, W. 1912-1955. Der Ursprung der Gottesidee: Eine historisch-kritische und positive Studie. (12 vols.) Münster in Westfalen: Aschendorff.
  • Slavet, E. 2009. Racial Fever: Freud and the Jewish Question. Fordham University Press.
  • Slavet, E. 2010. ‘Freud’s Theory of Jewishness For Better and for Worse’. In A.D. Richards (ed.) The Jewish World of Sigmund Freud: Essays on Cultural Roots and the Problem of Religious Identity, 96-111. North Carolina: McFarland & Co.
  • Smith, R.J. 2016. ‘Darwin, Freud, and the Continuing Misrepresentation of the Primal Horde’, Current Anthropology 57 (6), 838-843.
  • Thornton, S. 2015. ‘Projection’, in R.A. Segal and K. von Stuckrad (eds.) Vocabulary for the Study of Religion (vol. 3). Leiden/Boston, 138-144.
  • Tylor, E.B. 1871. Primitive culture: researches into the development of mythology, philosophy, religion, language, art, and custom (2 vols). London: John Murray.
  • Tylor, E.B. 1881. Anthropology: an introduction to the study of man and civilization. London: Macmillan & Co.
  • Whitebook, J. 2017. Freud: An Intellectual Autobiography. Cambridge University Press.
  • Wittgenstein, L. 1966. Lectures & Conversations on Aesthetics, Psychology and Religious Belief (ed. C. Barrett). Oxford: Basil Blackwell.
  • Wittgenstein, L. 1974. Philosophical Investigations (trans. G.E.M. Anscombe). Oxford: Basil Blackwell.
  • Yerushalmi, Y.H. 1993. Freud’s “Moses”: Judaism Terminable and Interminable. Yale University Press.

b. Further Reading

  • Alston, W.P. 2003. ‘Psychoanalytic theory and theistic belief’. In C. Taliaferro & P. Griffiths (eds.). Philosophy of Religion: An Anthology (123-140). Oxford: Blackwell Press.
  • Bingaman, K. 2012. Freud and Faith: Living in the Tension. Albany, NY: State University of New York Press.
  • Blass, R.B. 2004. ‘Beyond illusion: Psychoanalysis and Religious Truth’. The International Journal of Psychoanalysis, 85, 615-634.
  • Derrida, J. 1998. Archive Fever: A Freudian Impression (trans. E. Prenowitz). University of Chicago Press.
  • Gay, P. 2006. Freud: A Life for our Time. London: W.W. Norton & Company.
  • Ginsburg, R. et al. (eds). 2006. New Perspectives on Freud’s Moses and Monotheism (Conditio Judaica) 1st Edition. Tübingen: Max Niemeyer Verlag.
  • Hewitt, M.A. 2014. Freud on Religion. London & New York: Routledge.
  • Jones, R.A. 1986. Emile Durkheim: An Introduction to Four Major Works. Beverly Hills, CA: Sage Publications.
  • Kolbrener, W. 2010. ‘Death of Moses Revisited: Repetition and Creative Memory in Freud and the Rabbis’. American Imago, 67 (2), 243-262.
  • Milfull, J. 2002. ‘Freud, Moses and the Jewish Identity’. The European Legacy, vol. 7, 25-31.
  • Nobus, D. 2006. ‘Sigmund Freud and the Case of Moses Man: On the Knowledge of Trauma and the Trauma of Knowledge’. JEP: European Journal of Psychoanalysis: Humanities, Philosophy, Psychotherapies. Number 22 (1). Online at http://www.psychomedia.it/jep/number22/nobus.htm
  • Ofengenden, A. 2015. ‘Monotheism, the Incomplete Revolution: Narrating the Event in Freud’s and Assmann’s Moses’. Symploke, Volume 23 (1-2), 291-307.
  • Palmer, M. 1997. Freud and Jung on Religion. London & New York: Routledge.
  • Said, E. 2004. Freud and the Non-European. London: Verso.
  • Smith, D.L. 1999. Freud’s Philosophy of the Unconscious. Studies in Cognitive Systems, vol. 23. Dordrecht: Springer.
  • Tauber, A.I. 2010. Freud, The Reluctant Philosopher. New Jersey: Princeton University Press.

Author Information

Stephen Thornton
Mary Immaculate College, University of Limerick
Ireland

Frege’s Problem: Referential Opacity

The problem of referential opacity is to explain why a certain inference rule of classical logic sometimes produces invalid-seeming inferences when applied to ascriptions of mental states. The rule concerns substitution of terms for the same object, and here is one of the controversial examples. It involves the mental states of Lois Lane, who believes that Superman can fly. However, she does not know Superman is her coworker Clark Kent, and it is very natural to say that she doesn’t believe that Clark can fly. Yet the inference rule in question apparently allows the following dubious inference:

Superman is identical to Clark Kent.

Lois Lane believes that Superman can fly.

So, Lois Lane believes that Clark Kent can fly.

This inference rule is commonly called Leibniz’s Law, or Substitutivity of Identicals, or Identity Elimination. The problem it creates is often designated the problem of referential opacity, but because the word “opacity” promotes a particular theory, this article typically employs the more neutral nomenclature “(apparent) substitution-failure.” In this article, the term “Leibniz’s Law” is used instead for the principle

(1) If x and y are the same object, then x and y have the same properties.

And the terms “Identity Elimination” (“=E”) and “Substitutivity of Identicals” are reserved for the specific substitution rule illustrated above.

To formulate this rule precisely, we specify it as a rule of natural deduction. It applies to a major premise, which is an identity sentence (for example, “Superman is identical to Clark Kent”), and a minor premise, which contains at least one occurrence of the term on the left of the major premise. The rule permits replacing at least one such occurrence with the term on the right of the major premise. For example, =E is used to make the following inference:

Istanbul is identical to Constantinople.

Istanbul straddles Europe and Asia.

So, Constantinople straddles Europe and Asia.

This particular use produces a valid argument. However, applications of the rule to other sentences sometimes produce very counter-intuitive results, as illustrated by the case of Lois Lane, and so we get the problem of apparent substitution-failure. Philosophers of language disagree about how to explain, or explain away, such seeming failures.

The problem was introduced into modern discussion by Quine (1956, 1961). Important early contributions include Marcus (1961, 1962, 1975) and Smullyan (1948). The papers (Kaplan 1986) and (Fine 1989) are influential engagements with Quine. However, the essential problem was raised in the seminal (Frege 1892), and so it is also known as Frege’s Puzzle.

Table of Contents

  1. Identity Elimination and Its Misuses
    1. Quotation
    2. “So-Called”
    3. Modality
  2. The De Re/De Dicto Distinction
    1. Defining the Distinction
    2. Skepticism about the Distinction
    3. The de re and Leibniz’s Law
  3. Frege’s Theory of Substitution-Resistance
    1. The Sense/Reference Distinction Applied to Attitude Ascriptions
    2. The Hierarchy Problem
    3. The Semantic Innocence Objection
    4. Do Name-Senses Exist Anyway?
    5. Alternative Accounts of the Sense of a Name
  4. Hidden-Indexical Semantics
    1. Two Kinds of Hidden-Indexical Theories
    2. Kripke’s Puzzle
  5. Russellianism
    1. Salmon’s Theory
    2. Commonsense Psychology
    3. Saul on Simple Sentences
    4. Richard’s Phone Booth
  6. References and Further Reading

1. Identity Elimination and Its Misuses

A little more formally, the rule of inference =E can be stated as:

Identity Elimination Schema

Major: t1 = t2

Minor: ϕ(t1)

Conclusion: ϕ(t2)

Here t1 and t2 are expressions which refer to entities (for example, proper names of people or cities). ϕ(t1) is a sentence containing at least one occurrence of t1, and ϕ(t2) is a sentence that results from replacing at least one occurrence of t1 in ϕ(t1) with an occurrence of t2, eliminating the “=” of t1 = t2. Recurring ti presumes that ti is univocal throughout, and recurring ϕ presumes that the sentential context ϕ is not altered, syntactically or semantically, by the replacement. If these uniformity conditions are not met, then the inference scheme is being misapplied, and it is no wonder that false conclusions are derivable. For example, in the inference “The man behind Fred = the man in front of Bill; the man behind Fred saw him leave; therefore, the man in front of Bill saw him leave,” the context “saw him leave” is not uniform, since substitution of “the man behind Fred” by “the man in front of Bill” changes the reference of “him” (Fine 1989:222–3; Linsky 1967:104).
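The purely syntactic character of the schema can be made vivid with a small sketch. It is offered only as an illustration under simplifying assumptions, not as a formalization found in the literature: sentences are modeled as tuples of tokens, each name counts as a single token, and the function name identity_elimination is a hypothetical label. The function returns every sentence obtainable by replacing at least one occurrence of t1 with t2, and it is blind to whether the surrounding context is an attitude ascription, so it yields the disputed conclusion about Lois as readily as the uncontroversial one about Istanbul. It also makes no attempt to check the uniformity conditions just described.

```python
from itertools import combinations


def identity_elimination(major, minor):
    """Apply =E: `major` is a pair (t1, t2) standing for "t1 = t2";
    `minor` is a sentence, modeled as a tuple of tokens, containing t1.
    Returns every sentence that results from replacing at least one
    occurrence of t1 with t2 (an illustrative modeling assumption)."""
    t1, t2 = major
    spots = [i for i, token in enumerate(minor) if token == t1]
    if not spots:
        raise ValueError("minor premise must contain the left-hand term")
    conclusions = []
    # Choose every non-empty subset of the occurrences of t1.
    for size in range(1, len(spots) + 1):
        for chosen in combinations(spots, size):
            conclusions.append(
                tuple(t2 if i in chosen else token
                      for i, token in enumerate(minor))
            )
    return conclusions


# The uncontroversial use of the rule.
print(identity_elimination(
    ("Istanbul", "Constantinople"),
    ("Istanbul", "straddles", "Europe", "and", "Asia"),
))

# The same mechanical step yields the disputed conclusion.
print(identity_elimination(
    ("Superman", "Clark Kent"),
    ("Lois", "believes", "that", "Superman", "can", "fly"),
))
```

Because the procedure treats “Lois believes that … can fly” exactly as it treats “… straddles Europe and Asia,” any explanation of apparent substitution-failure has to come from semantics rather than from the rule’s mechanics.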

In discussing the problem with apparent substitution-failure by using =E, many examples will be drawn from the fictional story of Superman, treated as if it were true. In the story, a child from the planet Krypton, Kal-El, is sent to Earth, where physical conditions cause him to acquire superpowers. Wearing specific clothing (red cape, blue jumpsuit), Kal-El prevents disasters, rescues endangered innocents, and foils would-be perpetrators of crimes, such as Lex Luthor. People call Kal-El “Superman” when talking about Kal-El’s actions of this kind.

But Kal-El also takes a day job as a reporter, using the name “Clark Kent.” A coworker, Lois Lane, treats him with indifference in the office, but has a pronounced crush on, as she would put it, Superman, unaware they are the same individual.

The problematic examples discussed below involve ascriptions of mental states to Lois (or occasionally Lex), arrived at by applying the rule =E to the major premise “Superman is Clark” and a carefully chosen minor premise. Lois has a crush on Superman (minor premise), so, by =E, Lois has a crush on Clark. But this latter seems false, and would certainly be rejected by Lois herself. Also, Lois believes that Superman can fly, but does not seem to believe that Clark can; she hopes to see Superman again soon, but seems not much to care when she next sees Clark; she would like a date with Superman, but apparently has no interest in one with Clark; and so on. For a problematic use of =E, consider this paradigm example:

(2)
a. Superman is Clark Kent.                                      Major
b. Lois believes that Superman can fly.             Minor
c. ∴ Lois believes that Clark Kent can fly.         a, b =E

It is not a solution to the problem of referential opacity to say that when we apply the rule in an instance like (2), the flaw is that the major premise is one that Lois does not realize is true. No doubt her ignorance explains psychologically why she does not draw the conclusion that Clark can fly, in those very words, but it does not explain semantically how the inference rule can carry us from two truths to a seeming falsehood: “Lois realizes (2a) is true” is not itself a premise for the application of the rule in (2), so its falsehood is irrelevant to what is dubious about the application. Indeed, the rule enables the inference that Lois does realize (2a) is true: simply change the minor premise of (2) to “Lois realizes Superman is Superman,” surely unobjectionable once she has acquired the name “Superman” from watching Kal-El perform heroic deeds.

Some terminology is commonly encountered in discussions of cases like (2). Mental-state ascriptions like (2b) and (2c) are called attitude ascriptions, since the subject is being ascribed a mental attitude. When the thing the attitude is toward is specified by a “that”-clause (or by a clause complementized by “if” or “whether”), the ascription is called a propositional attitude ascription. This is because the “that”-clause is standardly taken to specify a proposition, the one expressed by the sentence which “that” prefixes (but see, for example, Davidson 1969, Bach 1997, and Moltmann 2003, 2008, 2017 for criticism of this). So (2b) says that Lois has the attitude of belief toward the proposition that Superman can fly. The sentence following the “that” in (2b) and (2c) is called the content-sentence, though in English, “that” can often be dropped (it is not obligatory in (2b) and (2c)).

a. Quotation

There is mileage to be gained from the idea that the reason we get counterintuitive instances such as (2) is that the rule of =E is being misapplied in some way, or, relatedly, that the rule as formulated is not a faithful reflection of the motivation provided by Leibniz’s Law, as stated in (1)—a better formulation would have to be misapplied to get (2). There are some well-known cases of misapplication of the rule which motivate critiques of (2) as a relevantly similar misapplication. One sort of case, emphasized by Quine (1961), is

(3)
a. Istanbul is Constantinople.
b. “Istanbul” has eight letters.
c. ∴ “Constantinople” has eight letters.

This is a misapplication of =E because the name “Istanbul” does not occur univocally in (3). In the major premise, it is used in the normal way to refer to a certain city. But in the minor premise, it is not used to refer to that city (perhaps it is not used to refer at all). Rather, it occurs as part of the complex quotation-name “‘Istanbul,’” referring to the name “Istanbul,” not the city Istanbul (this is a Tarskian rather than Fregean account of quotation—see further Richard 1986, Washington 1992, Saka 2006—but the nonuniformity objection to (3) holds on either). (3b) correctly predicates “has eight letters” of the word “Istanbul,” as opposed to unintelligibly predicating “has eight letters” of the city Istanbul. So (3) has no more force than a variant in which the minor premise reads “the first name used in (3a) has eight letters” and the conclusion reads “the second name used in (3a) has eight letters,” and which at best seems to presume the absurd principle that if two names refer to the same thing then they have the same number of letters.
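The nonuniformity in (3) can be made concrete with a small Python sketch, offered only as an illustration. It assumes that names can be modeled as strings and the city as an opaque object; `REFERENT`, `corefer`, and `has_eight_letters` are hypothetical names introduced here, not part of any account discussed in the text.

```python
# Illustrative sketch only: names are modeled as strings, the city as an
# opaque object. REFERENT and the helper names are assumptions of this example.

CITY = object()  # stands in for the city itself

REFERENT = {"Istanbul": CITY, "Constantinople": CITY}

def corefer(name1: str, name2: str) -> bool:
    """True iff the two names, used normally, pick out the same thing (3a)."""
    return REFERENT[name1] is REFERENT[name2]

def has_eight_letters(name: str) -> bool:
    """A predicate of the name (mentioned), not of the city (used)."""
    return len(name) == 8

major = corefer("Istanbul", "Constantinople")      # (3a): about the city
minor = has_eight_letters("Istanbul")              # (3b): about a word
conclusion = has_eight_letters("Constantinople")   # (3c): about another word
```

On this modeling, the major premise is a fact about what the names refer to, while the minor premise and conclusion are facts about the strings themselves, so the premises can both hold while the conclusion fails.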

Quine thought examples like (3) instructive. The position of “Istanbul” in (3b) is not open to substitution, like the position of “Superman” in (2b), and “Istanbul” does not seem to be referring normally in (3b), so perhaps the same should be said of “Superman” in (2b): the position “Superman” occupies in (2b) is referentially opaque, hence the terminology. But it is unclear how instructive (3) really is. Quine suggests (1956:186) that we should give “serious consideration” to construing mental state ascriptions such as (2b) as involving quotation. (2b) so-construed would say that Lois believes-true “Superman can fly” as a sentence of English.

But he immediately hedges by adding that this “is not to suggest that the subject speaks the language of the quotation, or any language…We may treat a mouse’s fear of the cat as his fearing-true a certain English sentence.” Unfortunately, we are left in the dark about what it is to believe-true or fear-true a sentence as a sentence of L when one does not know L. Quine then admits that the quotational construal of mental state ascriptions will only yield a “systematic agreement in truth-value…and no more.” But even that is doubtful. If “believes-true … as a sentence of L” is simply jargon for “believes that … is true-in-L,” a monolingual Czech who believes that Superman can fly would not do so according to this analysis (she may not even have heard of English); conversely, she may believe that “Superman can fly” is an example of a sentence that is true in English, because she has been told so by a reliable informant; clearly, this does not mean she believes Superman can fly, since she does not know what “fly” means. (See Church 1950 for a famous discussion of quotational accounts, and Schweizer 1993 for a technical investigation of quotational accounts of modal logic.)

A quotational account that does rather better, Quine notes, is that (2b) says that Lois believes the meaning of “Superman can fly,” which avoids the problem of the monolingual Czech. But then it is not really the presence of quotation that is blocking substitution. For if this new quotational account is correct, (2) is valid reasoning if (2a) guarantees that “Superman can fly” and “Clark can fly” mean the same. So (2)’s being a fallacy will require that (2a) not be sufficient for these two sentences to mean the same. This in turn seems to require an account of names on which names can be coreferential yet, one way or another, differ in meaning; and indeed, some accounts to be considered below pursue this. And then substitution-resistance need not be pinned on the presence of quotation.

b. “So-Called”

Quine has another example of misapplication of =E, but one which tends to undermine the thought that there is something referentially peculiar about the position occupied by the substitution-resistant name (though he appears to regard the example as supporting this idea). His well-known “Giorgione” case (Quine 1961:17) is as follows:

(4)
a. Giorgione is Barbarelli.
b. Giorgione is so-called because of his size.
c. ∴ Barbarelli is so-called because of his size.

In (4), there is nothing unusual about the way in which any of the names is used: in each use, there is simply reference to a certain artist. The reason the inference fails to be a legal application of =E is that the sentential context “is so-called because of his size” does not recur uniformly, since the reference of “so” changes in moving from (4b) to (4c): in (4b), “so” refers to the name “Giorgione,” but in (4c), it refers to the name “Barbarelli.” The supposed application of =E is therefore a simple fallacy of equivocation, brought about by the substitution having a hidden truth-condition-altering side-effect (altering the reference of “so”). But it may be an instructive fallacy, if anything like a covert “so” is present in attitude ascriptions. (For other examples of nonuniformity, see Fine 1989:222–36; for more on “so-called,” Forbes 2006:154–7, Corazza 2010, and Predelli 2010.)
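The hidden side-effect in (4) can be sketched in Python, under stated assumptions. The set `NAMED_FOR_SIZE` is a hypothetical lookup recording which names were bestowed because of size, and both functions are illustrative names introduced here. The first function models the original context, in which "so" picks up whichever name occupies subject position. The second models a paraphrase in which the name is explicitly quoted, so that the context no longer shifts under substitution.

```python
# Illustrative sketch: a sentential context may depend on the name token,
# not just on the referent. NAMED_FOR_SIZE is a hypothetical lookup.

ARTIST = "the artist"  # a stand-in for the man himself

REFERENT = {"Giorgione": ARTIST, "Barbarelli": ARTIST}

# Names assumed, for the example, to have been bestowed because of size.
NAMED_FOR_SIZE = {"Giorgione"}

def is_so_called_because_of_size(name: str) -> bool:
    """'N is so-called because of his size': 'so' refers to the name N itself."""
    return name in NAMED_FOR_SIZE

def is_called_giorgione_because_of_size(name: str) -> bool:
    """Uniform paraphrase: 'N is called "Giorgione" because of his size'."""
    same_man = REFERENT[name] == REFERENT["Giorgione"]
    return same_man and "Giorgione" in NAMED_FOR_SIZE
```

Substituting "Barbarelli" for "Giorgione" changes the value of the first function but not of the second, which is one way of seeing that the failure in (4) comes from the shifting reference of "so" rather than from anything unusual about the names.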

c. Modality

Our last example of misuse of =E involves intensional operators, that is, operators which do not allow interchange within their scope of accidentally coextensive expressions (two predicates are coextensive if and only if (iff) they actually apply to exactly the same things, and accidentally coextensive iff they are coextensive, but there could have been something to which one applies and the other does not; two sentences are accidentally coextensive iff they have the same actual truth-value but could have differed in truth-value). The standard cases of intensional operators are modal operators such as “it is necessary that,” “it is possible that,” and “it is contingent that.”

To illustrate how intensional operators can induce failure of substitution of accidentally coextensive predicates, suppose I have in my garage three cars, all Bentley racing cars from the 1920s, and that these are the only three in existence (the only three that Bentley ever built). Then for any x, x is a car in my garage iff x is a Bentley racing car. But it surely could have been that a car in my garage is not a Bentley, in the sense that there is a way things could have gone as a result of which a car from a different manufacturer ends up in my garage. By contrast, it is not possible that a Bentley racing car is not a Bentley. The problem is that the two predicates “x is a car in my garage” and “x is a Bentley racing car” are only accidentally coextensive, while modal operators are sensitive to what might be called the “modal profile” of expressions within their scope: the array of semantic values they have, sets in the case of predicates, across ways things could have gone, or “possible worlds.” “x is a car in my garage” and “x is a Bentley racing car” would have the same modal profile iff at each world, the set of things the first applies to is the same set as the set of things the second applies to. But as we have said, there is a possible world w where the set of things one predicate applies to is different from the set of things the other applies to, since there is, say, a Bugatti in my garage in w. As the example shows, attempts to substitute predicates which are not necessarily coextensive within the scope of a modal operator easily go awry, resulting in absurdities such as a Bentley that is not a Bentley: within the scope of “possibly” or “it could have been that,” “car in my garage” cannot be replaced by the accidentally coextensive “Bentley racing car” in the sentence “a car in my garage isn’t a Bentley.”
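The notion of a modal profile can be illustrated with a minimal Python sketch. It assumes a toy model with just two worlds, the actual world and the world w in which a Bugatti is in the garage. The car labels, the dictionaries, and the function names are all hypothetical choices made for the example.

```python
# Illustrative sketch: predicates as maps from worlds to extensions.
# The two worlds and the car labels are assumptions made for this example.

ACTUAL, W = "actual", "w"
WORLDS = [ACTUAL, W]

THREE_BENTLEYS = {"bentley1", "bentley2", "bentley3"}

# Extension of each predicate at each world.
CAR_IN_MY_GARAGE = {ACTUAL: set(THREE_BENTLEYS),
                    W: {"bentley1", "bentley2", "bugatti"}}
BENTLEY_RACING_CAR = {ACTUAL: set(THREE_BENTLEYS), W: set(THREE_BENTLEYS)}
BENTLEY = {ACTUAL: set(THREE_BENTLEYS), W: set(THREE_BENTLEYS)}

def coextensive(p, q) -> bool:
    """Same extension at the actual world."""
    return p[ACTUAL] == q[ACTUAL]

def same_modal_profile(p, q) -> bool:
    """Same extension at every world of the model."""
    return all(p[w] == q[w] for w in WORLDS)

def possibly_some_not(p, q) -> bool:
    """'It could have been that some P is not a Q'."""
    return any(len(p[w] - q[w]) > 0 for w in WORLDS)
```

In this model the two predicates are coextensive but differ in modal profile, so `possibly_some_not(CAR_IN_MY_GARAGE, BENTLEY)` holds while `possibly_some_not(BENTLEY_RACING_CAR, BENTLEY)` does not, mirroring the failed substitution described above.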

The same can happen with expressions which are accidentally coreferential. Suppose there are nine planets in our solar system, and that this is a contingent fact: there could have been more or fewer planets (on that definition of “planet”). Then the following use of =E derives a false conclusion from true premises:

(5)
a. The number of planets = 3²
b. It is contingent that the number of planets = 9
c. ∴ It is contingent that 3² = 9.

The conclusion is false because true mathematical identities such as “3² = 9” are the paradigm cases of necessary truths: in every way things could have gone, the number 9 is the outcome when the number 3 is multiplied by itself.

(5) differs from previous examples in that one of the terms in the major premise, “the number of planets,” is not a proper name, but rather what is called a singular definite description: “definite” because “the” coupled with a singular nominal implies exactly one, and “description” because the expression, if it picks out anything, picks out the individual that is the unique satisfier of the descriptive condition “F” in “the F,” in this case “number of planets.”

However, definite descriptions can be classified in at least two ways. One option is that they are treated as belonging to a unitary semantic category of singular terms, together with other grammatical categories such as proper names, demonstratives, and indexicals: expressions of all these types “designate” objects. The classification of definite descriptions with names goes back to Frege (1892). The other approach classifies a definite description “the F” as a first-order quantifier, like “some F,” “each F,” “no F,” and so on (the apparent structural similarity between “the F is G” and “{some/each/no} F is G” is seen as genuine). A quantifier like “some F” is a combination of a det(erminer) “some” with a predicate F, that then combines with a second predicate. In “(det F is G),” “F” is the restriction, or restrictor, in the quantifier “det F,” and “is G” is the quantifier’s scope. In symbols, to take a simple example, “no dog barked” would be represented as “(no x: x is a dog)[x barked],” and so by parallelism, “the dog barked” would be “(the x: x is a dog)[x barked]”: as in English, only det changes as we formalize “the dog barked,” “each dog barked,” “some dog barked,” and so on (for further discussion, see Davies 1981:149–52). (Russell’s Theory of Descriptions (1905) is a quantificational account in the looser sense that Russell took “the F” to be an apparent singular term in need of analysis by the standard determiners some and every. There is also a “predicate” account of some descriptions, as in Fara 2001.)

Only the singular-term account of descriptions raises the problem of referential opacity, for if the descriptions in (5a) are quantifiers rather than singular terms, they are not referential and =E could not be applied in the first place: the major premise is not of the form t1 = t2, but is rather “(the x: Fx)[(the y: Gy)[x = y]].”

However, even if descriptions are singular terms, they may be a special case semantically, which could make (5) not very illuminating about (2). Assuming the singular-term analysis, definite descriptions other than mathematical ones are, apart from certain unusual cases, nonrigid designators: they do not pick out the same object at all possible worlds (Kripke 1972, 1980:48ff). For example, the number nine is the unique satisfier of “number of planets” at the actual world, but in some other possible world, a different (natural) number is the unique satisfier, or, perhaps, there is no satisfier because there are no planets. “3²” is the less common case, a rigid definite description: “3²” abbreviates “the product of the number three with itself,” and nine uniquely satisfies “product of the number three with itself” at every possible world, since numbers exist in every possible world, “the number three” is another rigid description, and the product operation is the same at every possible world. (As hinted above, there are other ways of cooking up rigid descriptions; see Davies and Humberstone 1980. For more on nonrigidity, see Tichy 2004.)
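The contrast between rigid and nonrigid descriptions, and its bearing on (5), can be sketched in Python under the singular-term analysis. The three worlds and their planet counts are assumptions made for the example, as are all the function names. Worlds with no planets, where the description designates nothing, are left out for simplicity. “Contingent” is modeled here as true at the actual world and false at some world.

```python
# Illustrative sketch: singular terms as functions from worlds to designata.
# The worlds and their planet counts are assumptions made for this example.

WORLDS = {"actual": 9, "w1": 8, "w2": 10}   # world -> number of planets there

def number_of_planets(world):
    """Nonrigid description: its designation varies with the world."""
    return WORLDS[world]

def three_squared(world):
    """Rigid description: the product of three with itself, at any world."""
    return 3 * 3

def nine(world):
    """Rigid numeral."""
    return 9

def is_rigid(term) -> bool:
    """True iff the term designates the same thing at every world of the model."""
    return len({term(w) for w in WORLDS}) == 1

def contingent_identity(t1, t2) -> bool:
    """'It is contingent that t1 = t2': true actually, false at some world."""
    actually_true = t1("actual") == t2("actual")
    return actually_true and any(t1(w) != t2(w) for w in WORLDS)
```

In this model (5a) and (5b) come out true, since `number_of_planets` and `three_squared` agree at the actual world and `contingent_identity(number_of_planets, nine)` holds, while (5c) comes out false because `three_squared` and `nine` agree at every world.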

According to Kripke (1972), proper names, unlike typical descriptions, are rigid designators: they denote the same object with respect to every possible world. To see the case for rigidity, suppose we say that the planet Jupiter could have failed to exist. Here we are talking about a specific heavenly body which in the actual world orbits the Sun between Mars and Saturn, but which, we might say, in certain other possible worlds, is simply never formed, because of different behavior on the part of the original protoplanetary disk, or because a physical universe never comes into existence, or for whatever possible reason. When we say that Jupiter does not exist in such circumstances, we mean to be talking about our relatively familiar planet (it is the third brightest object in the night sky) and saying that it does not exist. So “Jupiter” denotes Jupiter at each possible world w, no matter what happens in w, even failure of Jupiter to exist (see further Salmon 1981:32–40).

It is crucial to problematic uses of =E in the style of (5) that at least one of the singular terms in the major premise be nonrigid. For if they are both rigid and also codesignate, then the minor premise and the conclusion will agree in truth-value. So we might propose a restriction on =E that makes the application in (5) illegal. The weakest restriction motivated by the failure of (5) is that t1 and t2 must have the same modal profile: for each w, either t1 designates the same thing as t2 at w, or neither designates anything at w. A slightly stronger restriction is that t1 and t2 have the same modal profile and at each w, each designates something. Here we are proposing a sui generis addition to the constraints that correct application of =E in modal languages must meet, a constraint that is required because we are treating definite descriptions as singular terms. But allowing application of =E in formal modal languages only if the terms in the major premise have the same modal profile is not workable, since two terms which have the same profile in one interpretation of the language (at each world, they denote the same thing) may have different profiles in another interpretation. So the standard approach is (i) to decree that =E is only applicable when t1 and t2 are proper names, and (ii) in the semantics stipulate that names are always rigid designators. (Some might object that it is illegitimate to sneak semantics into the statement of an inference rule, as the combination of (i) and (ii) does.)

Using “□” for “necessarily,” we can then prove

(6)
c = d ⊢ □(c = d),

simply using =E once, with the minor premise “□(c = c),” which is a theorem and therefore does not need to be mentioned on the left in (6). But (using “∃!” for “there exists exactly one”) we will not be able to prove even

(7)
the F = the G ⊢ □([(∃!x)Fx & (∃!x)Gx] → (the F = the G)),

much less the version with the unconditional conclusion, “□(the F = the G).” The restriction in =E to names blocks anything like a proof of (7) analogous to that of (6) just mentioned, and there is no way of formulating sound rules for “the” to get round this. So we can classify (5) as a misuse of =E, since in (5a) at least one term is not a proper name.

The relevant question for us is whether there is anything in our discussion to justify the claim that the definite description “the number of planets” occurs opaquely in (5b). As already noted, the idea that “the F” is really a quantifier would have to be rejected before the question whether descriptions are referentially opaque in modal contexts could even arise, since quantifiers are not referential. So for “referentially opaque” to be an accurate characterization of the occurrence of “the number of planets” in (5b), we must take a side, not necessarily the most plausible side, on the singular-term/quantifier issue.

Yet even granting that definite descriptions are singular terms, it is implausible that “the number of planets” is functioning deviantly in (5b), or in some other way that merits the term “opaque.” In an extensional language, the designation of a definite description in given circumstances is calculated following the semantic structure of the description. For example, “the man who first set foot on the Moon” will designate the unique entity, if there is one, that satisfies both “is a man” and “first set foot on the Moon.” To satisfy “first set foot on the Moon,” such an entity must be the first satisfier of “set foot on the Moon,” which in turn has further semantic structure. This evaluation procedure, of following the structure to arrive at a unique object (if there is one), does not change when we move to an intensional language; it is simply that in interpreting an intensional language there are varying circumstances with respect to which an expression can be evaluated. A conjunction A & B may have different truth-values in different circumstances, but no one would accuse “&” of being problematic on account of this. Similarly, the fact that “the F” can have different designations in different circumstances is hardly a cause for concern.

Of course, (5) may seem to indicate a problem; but then, so may the sequent

(8)
A ↔ B, ◇(A & C) ⊬ ◇(B & C)

(here “◇” means “possibly”; consider the case where C = ¬B). From (8), we learn that substitution on the basis of accidental equivalence does not work in modal languages, and we must constrain any substitution rule to require necessary equivalence. In the same way, from (5) we learn that substitution on the basis of accidental codesignation is invalid in modal languages, and we must constrain =E to allow its application only if the codesignation is necessary. This is exactly what we have done, by restricting the singular terms of the major premise to individual constants, whose semantics requires them to be rigid designators.
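The failure recorded in (8) can be checked against a small countermodel, sketched here in Python. The sketch reads the first premise of (8) as the material biconditional of A and B, as the talk of accidental equivalence suggests, and takes C to be ¬B. The two worlds, the valuation `VAL`, and the helper `possibly` are assumptions made for the example, with every world treated as accessible.

```python
# Illustrative sketch: a two-world model refuting the sequent in (8),
# taking C to be ¬B. World names and valuations are assumptions.

ACTUAL = "actual"

# Truth-values of the atoms A and B at each world.
VAL = {
    "actual": {"A": True, "B": True},
    "w": {"A": True, "B": False},
}

def possibly(condition) -> bool:
    """'◇': the condition holds at some world (every world is accessible)."""
    return any(condition(v) for v in VAL.values())

A = lambda v: v["A"]
B = lambda v: v["B"]
C = lambda v: not v["B"]   # C = ¬B

premise_1 = A(VAL[ACTUAL]) == B(VAL[ACTUAL])      # A ↔ B, at the actual world
premise_2 = possibly(lambda v: A(v) and C(v))     # ◇(A & C)
conclusion = possibly(lambda v: B(v) and C(v))    # ◇(B & C)
```

Both premises hold in this model, because A and B agree at the actual world and A & ¬B is true at w, but the conclusion fails, since B & ¬B is true at no world.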

Is there an analogous restriction on =E that we could employ to make the rule acceptable for languages with attitude verbs like “believe”? That t1 and t2 be rigid designators is insufficient, as (2) shows. And we want a condition that does not make it a matter of mere mental compulsion that any thinker in the minor premise’s propositional attitude comes to be in the conclusion’s propositional attitude: it has to be logically guaranteed. Plausibly, nothing weaker than identity of proposition determined by the two “that”-clauses satisfies this demand. So if we agree that a difference in the semantics of the two names would result in the two content-sentences in (2) expressing different propositions, we will have to say that the two names in a use of =E in the likes of (2) must be synonymous.

But it is not clear what it means to apply “synonymous” to a pair of names. Names are not usually found in dictionaries, so the normal notion of synonymy, on which, say, “attorney” and “lawyer” are synonyms in virtue of having the same dictionary definition, will not help. There is also a more serious objection, due to Mates (1952), to the effect that even substitution of dictionary synonyms in attitude ascriptions can produce results not much more comfortable than (5). For example, (9a) below may well be false, yet it seems (9b) could still be true:

(9)
a. I suspect that many people doubt that everyone believes all lawyers are lawyers.
b. I suspect that many people doubt that everyone believes all lawyers are attorneys.

One moral we might draw from “Mates cases” like this is that searching for a criterion which allows substitution of t2 for t1 in attitude reports is likely to be futile. (For further discussion of attitude reports differing by a synonym, see Burge 1978 and Kripke 1979:160–1.)

To summarize, we have considered three incorrect uses of =E, (3), (4), and (5), in the hope that understanding why they go wrong will help us gain clarity about (2). But (3) turned out not to be so useful, given the drawbacks to quotational accounts of attitude ascriptions. (5) suggests trying to modify =E by limiting its use to some favored class of singular terms, but Mates cases cast doubt on whether this line will be productive (see also Kaplan 1969, Section xi). This leaves (4), which shows how a substitution can have a hidden truth-condition-altering side-effect, a paradigm to which we will return.

For the moment, we note a distinction which emerges from the unhelpfulness of (5). (5) illustrates difficulties for =E which arise from the intensionality of certain vocabulary, primarily modal operators, difficulties resolved by a more careful statement of the rule. On the other hand, the difficulties for =E illustrated by (2) do not seem to be resolvable in a similar way. So the problem manifest in (2) is said to arise from the hyperintensionality, or fine-grained intensionality, of psychological vocabulary such as attitude verbs (a context is hyperintensional iff interchange of necessarily coextensive expressions in it can fail). However, even hyperintensional semantics does not necessarily legitimize a qualified version of =E. (For a version of hyperintensional semantics that takes propositions as primitive, see Thomason 1980, Muskens 2005; for a study of some alternatives, see Fox and Lappin 2005; for the use of “impossible worlds” to analyze hyperintensionality, see the exposition and references in Berto 2013; for a derivational account of hyperintensionality, see Bjerring and Rasmussen 2018; and for an argument that “probably” is hyperintensional, see Moss 2018:§7.5).

2. The De Re/De Dicto Distinction

It is possible to get oneself into a frame of mind according to which there is no such thing as hyperintensionality, and the reasoning of (2) is not flawed at all. For if Lois believes that Superman can fly, then, since Superman is Clark, she just does believe that Clark can fly, even though she would not put it that way. What you believe is one thing, which words you are inclined to use when stating your beliefs is another, and if you are ignorant of an identity, you may disprefer or even reject particular wording that nevertheless captures what you believe. So even though Lois would laugh if someone suggested to her that Clark has superpowers (in those very words), she may still believe it.

One view about this argument in favor of (2) is that it is essentially correct. We shall return to this Russellian position later. But a second view is that it exploits an ambiguity that is present in (2b), “Lois believes that Superman can fly,” and in (2c), “Lois believes that Clark can fly.” According to this view, an attitude ascription such as (2b) can be read in a way that permits substitution and in a way that does not. Normally, we understand such ascriptions in the way that does not, which is why we reject (2), but if cajoled enough (“look, she does believe Clark can fly, she just wouldn’t say it like that”), we may switch to a reading that allows substitution. In the usual terminology, this is called the de re reading, contrasting with the more common de dicto reading, which disallows substitution. Other terminology for this reading is relational, contrasting with notional; transparent, contrasting with opaque; and wide scope, contrasting with narrow scope. We turn now to explaining what distinction these labels attempt to mark.

a. Defining the Distinction

None of the above terminology is entirely happy. It is unclear in what sense the substitution-resistant reading of (2b) is any less “about the thing” (“de re”) than a putative substitution-permitting reading, nor is it clear why the truth of (2b) understood in a substitution-resistant way makes the subject of the ascription any less related to the object the attitude is about (Lois believes Superman can fly because she has seen him do it). And “transparent/opaque” employs the notion of opacity, which, if it is not just a synonym for “substitution resisting,” suggests failure to refer in the normal way, an idea we have yet to find a justification for.

But “wide scope/narrow scope” is more useful. The rationale for “wide scope” is the thought that a substitution-permitting reading of (2) can be brought out by a formulation in which the crucial name is moved to a position in front of the attitude verb (it has wide scope with respect to the verb), as illustrated in

(10)
a. Superman is such that Lois believes that he can fly.
b. Superman is someone who Lois believes can fly.

The step from (2b) to (10a) or (10b) is called exportation, and it is intuitively plausible that the exported forms permit substitution: if Superman is someone Lois believes can fly and if Superman is Clark, then indeed Clark is someone Lois believes can fly. So if we read the minor premise and conclusion of (2) in the exported way, we have an explanation of why someone might, under pressure, accept (2) after all. For (2a) and either (10a) or (10b) entail the exported variant of (2c). Note that we are not saying that exportation is valid, for example, that (2b) entails (10a) (though it seems to—for worries about existential commitment of the kind raised in Donnellan 1974, see Forbes 1996:357–62, and more generally Kvart 1984). The point here is just that (2b) and (2c) could be understood straight off in the style of (10), which would explain why (2) might be swallowed.

One advantage of the wide-scope/narrow-scope terminology is that it reflects a difference whose existence is not in doubt, insofar as it is simply syntactic, manifested in the contrast between, say, (2b) and (10a). But of course, there is a question whether the syntactic difference marks any interesting semantic one.

To argue for a semantic difference, we may observe that the same syntactic distinction arises with definite descriptions and (other) quantifiers, where a semantic difference is undeniable. For example, we have

(11)
a. Lois believes the extraterrestrial who works at The Daily Planet likes her.
b. Lois thinks that no extraterrestrial is in this conference room.
c. Lois hopes that someone born on Krypton will come to her aid.

If the quantifiers are given narrow scope, that is, if the examples in (11) are interpreted following word-order, (11a) is false, (11b) is (say) true, and (11c) is false. (11a) is false because Lois does not think that there are any extraterrestrials who work at The Daily Planet, so would not use “The extraterrestrial who works at The Daily Planet likes me” to express any belief of hers. (11b) is true even though Clark is in the conference room along with Lois and she sees and recognizes him. But since Lois presumes none of her colleagues is an extraterrestrial, she will happily use “No extraterrestrial is in this conference room” to say what she believes about the planetary origins of those in the room. And (11c) is false because (let us suppose) Lois has never heard of the planet Krypton; therefore, she will not think or say “Would that someone born on Krypton comes to my aid!” At least, these are the commonsense verdicts about the examples in (11), based, as is evident, on maintaining a close connection between the content of mental states and their verbal expression by the subject (on which, see Burge 1978:132).

However, these judgments of truth-value reverse themselves when we consider the exported forms:

(12)
a. The extraterrestrial who works at The Daily Planet is someone who Lois believes likes her.
b. No extraterrestrial is someone Lois thinks is in the conference room.
c. Someone born on Krypton is such that Lois hopes that person will come to her aid.

(12a) is true because Clark is the extraterrestrial who works at The Daily Planet and Lois believes Clark likes her; (12b) is false because Clark is an extraterrestrial and Lois thinks Clark is in the conference room; and (12c) is true because Superman was born on Krypton and Lois hopes Superman will come to her assistance. (The intuition that (12a) and (12c) are true and (12b) false suggests that what is required for the truth of, say, (12a), is that Lois have at least one name t of Kal-El such that she expresses a belief of hers with an assertion of “t likes me” literally construed. So the falsehood of (12a) would require her to have no such name; that she will not use “Superman likes me” to express a belief of hers is insufficient for the falsity of (12a).)

Not only does this contrast between (11) and (12) indicate that exportation makes a semantic difference, it also indicates what that difference is. The false cases in (11) are false because they make attitude attributions to Lois using concepts that either she lacks (“born on Krypton”), or thinks empty (“extraterrestrial who works at the Daily Planet”) and so would not employ positively in any belief she has; while the true case, (11b), is true precisely because “no extraterrestrial” is used to specify the content of her belief. In (12), on the other hand, problematic material is kept out of the specification of Lois’s mental states, which allows (12a) and (12c) to be true, while in (12b), we get a falsehood precisely because “no extraterrestrial” functions simply as an objectual quantifier, without characterizing the content of her belief. So in propositional attitude attributions with wide-scope material binding into the content-sentence, the content-sentence only partially characterizes the attitude, while if there is a “closed” content-sentence within the scope of the attitude verb, that is, if there is no exported material, the content-sentence fully characterizes the attitude. And we can then, if we like, resurrect the “de re/de dicto” terminology and use it in the same way as “wide scope/narrow scope.” The hallmark of a de re attribution is not that it says that the subject of the attribution stands in a special relation to the thing the attitude is about, but that the attribution designates or characterizes that thing in a way the ascriber chooses irrespective of whether the subject would accept the characterization, and the subject’s resisting the characterization is not even prima facie reason to think the attribution false; while a contested de dicto attribution is prima facie false. (See further Brogaard 2008:105–7 and Yalcin 2015:210–13; also see Marcus 1962 and Kazmi 1987 on the interpretation of exported quantifiers.)
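The contrast between narrow-scope and exported readings can be sketched in Python, under a deliberately crude assumption: that Lois's beliefs are represented by the set of sentences she would sincerely use to express them. `ASSENTS`, `REFERENT`, and both functions are hypothetical names introduced for the illustration. The exported reading is modeled along the lines suggested above, as requiring at least one name t of the individual such that Lois would assert the corresponding sentence.

```python
# Illustrative sketch: de dicto vs. exported (de re) readings of belief reports.
# ASSENTS and REFERENT are hypothetical stand-ins for Lois's state and the facts.

KAL_EL = "Kal-El"  # the individual himself

# Which individual each name refers to (the facts, not Lois's opinions).
REFERENT = {"Superman": KAL_EL, "Clark": KAL_EL}

# Sentences Lois would sincerely use to express her beliefs.
ASSENTS = {"Superman can fly", "Clark likes me"}

def believes_de_dicto(name: str, predicate: str) -> bool:
    """Narrow scope: the content-sentence fully characterizes the belief."""
    return f"{name} {predicate}" in ASSENTS

def believes_de_re(name: str, predicate: str) -> bool:
    """Wide scope: some name t of the same individual yields an assented sentence."""
    target = REFERENT[name]
    return any(f"{t} {predicate}" in ASSENTS
               for t, ref in REFERENT.items() if ref == target)
```

On this modeling, `believes_de_dicto("Clark", "can fly")` fails while `believes_de_re("Clark", "can fly")` succeeds, because the exported reading only requires that some name of Kal-El do the job. Whether assent to sentences is an adequate stand-in for belief is, of course, one of the questions at issue.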

This gives us a nontendentious way of using “de re/de dicto,” aligned with “wide scope/narrow scope,” that justifies our proposed diagnosis of any inclination to say that (2) passes muster: the diagnosis is that such judgment relies on construing the minor premise and conclusion as if they were in exported form, that is, construing them as de re attributions in the just explained sense. Still, it is worth observing that on this account we are equating the permits-substitution/resists-substitution distinction in the examples in question with a scope ambiguity. This may be too strong: there may be a substitution-permitting reading of, say, (2b), “Lois believes that Clark can fly,” which is not to be explained as involving a wide-scope reading for “Clark.” We will return to this point later, in connection with hidden-indexical semantics.

b. Skepticism about the Distinction

We have arrived at an apparently defensible way of understanding the de re/de dicto distinction, however the distinction is to be employed. We must therefore note that there are expressions of skepticism about it in the literature, for example Dennett (1982), Richard (1990:128–31), Sosa (1970), and Taylor (2002), whose points have not been addressed here. So, let us briefly consider a selection.

Taylor points out that even if using a definite description provides an accurate characterization of what a subject J believes or doubts, in the sense that the content-sentence containing the description echoes the sentence J would produce to express J’s attitude, an ascriber will in certain cases resist using the description. These are cases where the ascriber thinks that the definite description is improper (a singular definite description the F is improper iff it is not the case that there is exactly one F). Thus, on seeing Smith’s dismembered corpse, Jones may leap to the conclusion that he was murdered and say “Smith’s murderer must be insane”; this is a “whoever that is” use of a description (Donnellan 1966; I am assuming “Smith’s murderer” is a form of “the murderer of Smith”). But if Black knows or believes that Smith was in fact savaged to death by an escaped tiger, she will not make ascriptions like “Jones thinks Smith’s murderer is insane” or “Jones expects the police to capture Smith’s murderer quickly.” This is puzzling if we have the practice of making de dicto ascriptions to reflect the content of the subject’s attitudes, and there is no reason to doubt that Jones’s statement “Smith’s murderer must be insane” expresses in his mouth what he believes (see further Maier 2015).

This reluctance to ascribe may be a result of pragmatic considerations. One reason to think so is that even in the circumstances of the case, it seems that Jones can properly self-ascribe notionally with “I believe Smith’s murderer is insane.” If Black asserts “Jones believes Smith’s murderer is insane” just before realizing she should not, and if “believe Smith’s murderer is insane” is univocal between Black’s ascription and Jones’s self-ascription, the difference in assertibility most probably has to do with the shift in context of utterance, specifically the shift in speaker. One might flesh this out in terms of “the” being a presupposition-trigger, entailing, even when in the scope of normally entailment-canceling operators such as negation, that its restriction is uniquely satisfied, which in our case means that exactly one person murdered Smith. Then since Black knows that Smith was not murdered, she will not say anything that entails that he was. Nonfactive attitude verbs are often said to suppress the triggering (“projection”) of presuppositions (see Kadmon 2001:116), but in view of Taylor’s examples, this may be wrong, or at least too simple.

A weaker pragmatic approach proposes that using a definite description in a belief-ascription conveys (merely) that the ascriber grants or takes the description to be proper. And cooperative speakers who know this do not use descriptions they think improper. So the difference between Black’s ascription and Jones’s self-ascription is explained. The question would then be how this implicature arises.

So far as undermining the idea that there are de dicto or notional ascriptions goes, one might say that the use of presupposition-triggers in the content-sentence creates a principled exception. One would then expect the phenomenon noted above to recur with other triggers. Jones may say “I think I will manage to save enough money,” but Black should not report “Jones thinks that he will manage to save enough money” unless Black grants Jones’s presupposition that saving enough money will be difficult. For if Black knows that the sum is small and that Jones can easily afford it, on this account she would not want to use “manage,” unless ironically.

There is also a question about how manifest the phenomenon that Taylor isolates is with other quantifiers. If Jones says “everyone who attacked Smith will be brought to justice” (he now thinks there were multiple killers), would Black, who knows about the tiger, happily report “Jones thinks everyone who attacked Smith will be brought to justice,” even though Jones says so? If the report seems infelicitous, that may be a point in favor of a pragmatic account if it is combined with a presuppositional account of “every F” in “every F is G.” According to such an account, the restriction F, in this case “person who attacked Smith,” is presupposed to be nonempty (see Heim and Kratzer 1998:159–72).

Sosa (1970) has an interesting example which tries to undercut the de re/de dicto distinction by suggesting that there are no hard-and-fast limits on exportability and so no substantial cognitive relation invoked by the exported form. In an extreme case (Sleigh 1968), if S believes there are spies but only finitely many, and that all have heights but no two have the same height, S may infer and come to believe “the shortest spy is a spy,” and Sosa would allow the exported ascription “the shortest spy is someone S believes is a spy.” So if Phil Kimbly is the shortest spy, Phil Kimbly is someone S believes is a spy (strangely, S, though the most upright of citizens, never thinks of contacting the FBI).

The argument for this laissez-faire stance about exportation is that there are examples where it is perfectly natural. For instance (Sosa 1970:890), the Commanding Officer (CO) may say to the captain, “Tomorrow I want the shortest platoon member to go first” or “I think the shortest platoon member should go first tomorrow.” The CO has no idea who the shortest platoon member is, but in fact it is the unfortunate Smith again (this is before he meets the tiger). The captain knows Smith is the shortest, and says to the sergeant, “The CO wants Smith to go first tomorrow”/“The CO thinks Smith should go first tomorrow,” or to Smith, “The CO wants you to go first tomorrow.” It is perfectly natural for the captain to say such things, yet the ascriptions seem to be arrived at by first exporting a description used by the CO in a whoever-that-is way, and then substituting a name or pronoun. But should not we object to the exporting, on the grounds that the CO does not have a desire or belief or doubt about Smith, that such-and-such? His desire that the shortest platoon-member go first seems to be no more about Smith than S’s belief that the shortest spy is a spy, arrived at as described, is about Phil Kimbly. But why then is “The CO wants Smith to go first tomorrow” so natural?

According to Kripke (2008:348), examples like these are “toy duck” cases: a child in a toy store points at a stuffed animal, asking his mother if it is a goose, and she replies “No, it’s a duck.” Kripke implies that what the mother says, no matter how natural, cannot really be true: “no dictionary should include an entry under ‘duck’ in which ducks…may not be living creatures at all” (346). Another example might be that you and I go to an exhibition of the work of a famous forger who specialized in analytic cubism. Pointing at one of his forgeries on the wall, I ask “Is that a Picasso?”, to which you reply, “No, it’s a Braque.” This is a natural conversation, but the painting is not really a Braque, and we should not explain the use of artists’ names as predicates of their works in a way that permits an NN not to be by NN. Of course, the simplest explanation of the naturalness of these dialogues is that the remarks “It’s a {duck/Braque}” are true, even though the duck is made of artificial fibers and Braque had nothing to do with the Braque (see Partee 2003 for how this could be). So if we follow Kripke in rejecting that explanation, we need to find another. Fortunately, at least in Sosa’s case of “The CO wants Smith to go first tomorrow,” it is not hard to see what the naturalness consists in: Smith is the person whose going first tomorrow will satisfy the CO’s desire that the shortest platoon-member, whoever he is, go first tomorrow; and Smith is the person whose going first tomorrow would realize the quantified eventuality the CO believes should obtain. Rather than leave it up to the sergeant to find out who the relevant individual is, the captain just tells him, and rather than do so by some laborious step-by-step reasoning about how to satisfy the CO’s desire, the captain makes an attitude ascription that is strictly false, but serves both his and the sergeant’s interests in seeing that the CO’s order is obeyed; for to obey the order, an individual has to be identified. By contrast, the Phil Kimbly ascription seems unnatural because there is no surrounding context to give it a rationale. Perhaps we could invent one, but doing so would not turn an incorrect exportation into a correct one, nor does it in Sosa’s example. An ascription can be well motivated and promote efficiency in communication, but still be literally false.

c. The de re and Leibniz’s Law

Assuming that the de re/de dicto distinction survives skeptical attack, there is one more issue we can address with its aid. At the start of this essay, we distinguished Leibniz’s Law, “if x and y are the same object, then x and y have the same properties,” from the inference rule of =E. Problem cases for the rule might suggest that the Law itself is dubious. Why have we not considered this possibility?

The reason is that the Law is formulated in terms of objects and properties, and to regard examples like (2)–(5) as threats to it, we would have to construe these inferences as specifying properties of objects in their minor premises; but when we do this, we see that the apparent threat to the Law fades, as follows.

(3) is a “wrong object” case, for (3b) ascribes a property to a word, but in (3a) the objects x and y are cities. (4) is a case of failure to specify a property of an object: (4b) seems to involve the property being so-called because of its size, but the italicized phrase fails to specify a property, because of the uninterpretability of its “so”: “so” needs a context, linguistic or otherwise. There is certainly at least one property of objects in the offing, that of having a name which was endowed on the basis of size. But in conformity with the Law, that property is shared with Barbarelli, and the sentence attributing it, “Giorgione has a name endowed on the basis of his size,” falls short of what (4b) says. There is also the property being called “Giorgione” on account of size, but this is shared with Barbarelli too.

As for (5), there is certainly a reading of (5b) in terms of properties of objects: the property of contingently being 9 is ascribed to the number that numbers the planets. But then (5b) is false, since this number is 9, and 9 is not contingently 9. In other words, this property-of-objects construal requires a de re reading of (5b), with the description “the number of planets” exported, resulting in a falsehood.

Another property-of-objects construal of (5b) is one where the property is contingency and the object is the proposition that the number of planets is 9. On this reading, (5b) is true. But this turns (5) into another wrong object case, since in the major premise the objects are numbers, not propositions. And if we change (5a) to make it about propositions, it would have to say that the proposition that the number that numbers the planets is 9 is the same proposition as the proposition that 3² is 9. If (5) is reformulated this way, it is clearly a correct use of =E, but the falsity of the conclusion, that the proposition that 3² is 9 is contingent, means the rewriting of the major premise to state an identity between propositions produced a falsehood: they are not the same proposition at all.

So what of the original (2)? Here the property-of-objects construals of the minor premise are parallel to those in (5), but we do not want to say quite the same things about them. One property-of-objects reading of (2b) is that Superman has the property of being believed by Lois to be able to fly. (2a) is an identity involving Superman, so certainly we can use =E, in this case to infer that Clark has the property of being believed by Lois to be able to fly. This is just a slightly different formulation of the way of understanding the argument that we identified above as underlying an inclination to say that (2) is valid: the crucial point is that the names that are syntactically in the scope of “believes” are interpreted semantically to be exported from its scope. But we do not arrive at (2c), understood as false: that would require importation of “Clark” back into the scope of “believes,” and the fact that (2c) is by default understood as false shows that importation is invalid.

As with (5), we can reconstrue the minor premise and conclusion of (2) to be specifically about propositions. (2b) would then say that the proposition that Superman can fly is believed by Lois, and (2c) would say that the proposition that Clark can fly is believed by Lois. To prevent this just being another wrong-object case, (2a) would then have to be changed to an identity between propositions. Specifically, it would assert that the proposition that Superman can fly is the same proposition as the proposition that Clark can fly. The =E inference is then entirely in accord with Leibniz’s Law. The problem, of course, is that one is inclined to infer that the asserted identity between the propositions is false.

Perhaps we should say, then, that (5) is partially instructive as regards (2), in that there are parallel property-of-objects readings. What (5) does not help with is the formulation of a restriction on the terms used in =E that allows syntactically unstructured individual constants to be substituted in formulations like those actually used in (2); moreover, there seems to be no way to do this.

3. Frege’s Theory of Substitution-Resistance

a. The Sense/Reference Distinction Applied to Attitude Ascriptions

According to the framework for semantics of natural language sketched in Frege (1892), every meaningful phrase of natural language has potentially two sorts of meaning, a reference (Bedeutung) and a sense (Sinn, a cause of many puns in the titles of worthwhile pieces—for example, Dummett 1973 Ch. 17, Burge 1979, Forbes 1990 (if I may), Salmon 1990; for issues about the translations of these German words, see the discussion and references in Kripke 2001:254, n.1). A meaningful expression e, or a use of e, expresses a sense. Its sense determines its reference (if it has a reference) by virtue of being a way of thinking (or “mode of presentation”) of that reference, but whether there is a reference can depend on how things are in the world. In the case of a singular term, the reference is the thing it designates. For example, the sense of the name “Aristotle” might be articulated by “the pupil of Plato who tutored Alexander and wrote the Nicomachean Ethics.” Whether or not the name “Aristotle” has a reference then turns on whether or not there was such a person.

The same is true of sentences. A sentence expresses a thought, or, in current jargon, a proposition, and a proposition with a reference refers to a truth-value, true or false (the idea that propositions refer is a little odd, but see Dummett 1973:180–6). For example, the proposition that Aristotle was a philosopher is a way of thinking of a truth-value: this proposition is the proposition that the pupil of Plato who tutored Alexander and wrote the Nicomachean Ethics was a […] (here readers should substitute their favorite explanation of “philosopher” for the ellipsis, but please, not “one who philosophizes”). Assuming that there was such a person, then this proposition is a way of thinking of true. However, if “Aristotle” lacks a reference because there was no such person, the proposition “Aristotle was a philosopher” will lack a reference because it has a part that lacks a reference.

It is an important point about this apparatus that the calculation of the reference of the whole proposition or sentence expressing it proceeds via the references of the parts. In the case of “Aristotle was a philosopher,” the reference of the whole sentence is obtained by composing the references of “Aristotle” and “was a philosopher,” as determined by their senses, in a way which results in a truth-value. So, it is easiest to think of the reference of “was a philosopher” as a function, one which, applied to an object, produces a truth-value (functions are input-output operations, so in this case the object is the input, the truth-value the output). Then if “Aristotle” provides an object, we will get a truth-value from “was a philosopher.” But if there was no such person, this procedure will hang, waiting for an object when none is going to be provided. This motivates the verdict that in case the name is empty, the sentence is neither true nor false.
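The compositional procedure just described can be sketched in a few lines. This is only a toy model under stated assumptions: objects are represented as strings, a “world” is a dictionary from names to the objects they designate, and the identifiers (`was_a_philosopher`, `sentence_reference`, and so on) are invented for the illustration rather than drawn from Frege.

```python
from typing import Callable, Dict, Optional

# Illustrative assumptions: objects are plain strings, and a "world" maps
# each name to the object it designates (empty names are simply absent).
PHILOSOPHERS = {"the man Aristotle"}


def was_a_philosopher(obj: str) -> bool:
    """Reference of the predicate: a function from an object to a truth-value."""
    return obj in PHILOSOPHERS


def sentence_reference(
    name: str, predicate: Callable[[str], bool], world: Dict[str, str]
) -> Optional[bool]:
    """Compose the references of the parts; None models 'neither true nor false'."""
    obj = world.get(name)  # the name's reference, if it has one
    if obj is None:
        return None  # the predicate function never receives an input
    return predicate(obj)


world_with_aristotle = {"Aristotle": "the man Aristotle"}
world_without_aristotle: Dict[str, str] = {}
```

Evaluating `sentence_reference("Aristotle", was_a_philosopher, world_with_aristotle)` yields a truth-value, while the same call against `world_without_aristotle` yields `None`, which stands in for the verdict that the sentence is neither true nor false when the name is empty.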

The sense-reference distinction suggests that we may be able to explain how (13a) below can be true while (13b) is false:

(13)
a. Lois hopes Superman is nearby.
b. Lois hopes Clark is nearby.

Assuming that the names have different senses (perhaps “the red-caped superhero who flies” versus “the mild-mannered Daily Planet reporter with a crush on Lois Lane”), (13a) and (13b) will express different propositions because their embedded content-sentences do, and so (13a) and (13b) at least potentially may refer to (that is, have) different truth-values. But truth-value is at the level of reference, and the corresponding constituents of (13a) and (13b) are all coreferential (given a fixed context to determine what counts as “nearby”). Specifically, the references (truth-values) of (13a) and (13b) are calculated from the references of their three main constituents: (i) “Lois,” referring to Lois; (ii) “hopes,” referring to the hoping relation; and (iii) “Superman is nearby” and “Clark is nearby,” respectively, which refer to the same truth-value. Since (i) and (ii) are common to (13a) and (13b), (13a) and (13b) must also have the same reference, that is, same truth-value, even if they express different propositions by virtue of having content subsentences that express different propositions. So it looks as if Frege’s apparatus does not get us any closer to an account of how (13a) and (13b) might differ in truth-value.

Explanation of references as functions may be extended to expressions other than singular terms and sentences. For example, “hopes” at this point is assumed to refer to a function f that takes a truth-value as input, say the truth-value of “Superman is nearby,” and produces as output another function, g, the reference of the verb-phrase “hopes Superman is nearby.” g takes the referent of the name “Lois” as input and produces the truth-value of (13a) as output. The problem is then that “Superman is nearby” and “Clark is nearby” present the same truth-value to f, which must therefore output the same function g as the referent of the two verb-phrases “hopes Superman is nearby” and “hopes Clark is nearby” (same input requires same output). Thus, Lois is mapped to true by both verb-phrase functions, or to false by both, since they are both the function g; and so (13a) and (13b) are equivalent.
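The equivalence argument can be made concrete with a toy model. The sketch below assumes, purely for illustration, that both content-sentences are true and that the facts about hoping are recorded as pairs of a subject and a truth-value; `hopes_extensional`, `HOPE_FACTS`, and `ascription_13` are hypothetical names introduced here.

```python
from typing import Callable, Set, Tuple

# Illustrative assumption: both content-sentences have the same customary
# reference, since "Superman" and "Clark" are coreferential.
CUSTOMARY_REFERENCE = {"Superman is nearby": True, "Clark is nearby": True}

# Hypothetical record of who hopes what; on this account the verb can only
# "see" a truth-value, so that is all the record can mention.
HOPE_FACTS: Set[Tuple[str, bool]] = {("Lois", True)}


def hopes_extensional(truth_value: bool) -> Callable[[str], bool]:
    """f: takes a truth-value and returns g, the reference of the verb-phrase."""
    def g(subject: str) -> bool:
        return (subject, truth_value) in HOPE_FACTS
    return g


def ascription_13(content_sentence: str, subject: str = "Lois") -> bool:
    """Truth-value of 'Lois hopes <content_sentence>' on the extensional account."""
    g = hopes_extensional(CUSTOMARY_REFERENCE[content_sentence])
    return g(subject)
```

However `HOPE_FACTS` is filled in, `ascription_13("Superman is nearby")` and `ascription_13("Clark is nearby")` return the same value, because `hopes_extensional` receives the same input in both cases.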

The source of the difficulty is clear: we have taken the reference of “hope” to be a function of the truth-values of content-sentences that follow it. This is not arbitrary, for the calculation of the reference of any complex phrase uses the references of its constituent phrases along the way, and the content-sentence of the ascription does indeed refer to a truth-value, at least when asserted in isolation, or more broadly, when it occurs extensionally, not in an intensional or hyperintensional context. But this is a very unintuitive account of the reference of “hope.” The thing the attitude of hoping is taken toward is surely a proposition, not a truth-value: the proposition that Superman is nearby is what Lois hopes to be true, not the proposition’s truth-value.

So, on the one hand, we want “hope” to take the reference of its complement sentence as its input, because reference is computed from referents. On the other hand, we want “hope” to take the proposition expressed by its complement sentence as its input, because it is propositions whose truth we hope for. But the proposition is the sense of the content-sentence, not the reference.

To solve this conundrum, Frege made a move of what Kaplan called “brilliant simplicity” (Kaplan 1969:117): we attribute to attitude verbs the property of switching the reference of the material that follows in the ascription from the “customary” reference of that material to a different reference, namely, the customary sense (also known as the “indirect” reference). So in (13a), the (customary) reference of “hopes Superman is nearby” is obtained by applying the (customary) reference of “hope” to the reference “Superman is nearby” has in (13a), its indirect reference, that is, its customary sense. Thus, the reference of “hope” gets the proposition that Superman is nearby as input, as we wanted. This means reference is relativized to linguistic context of occurrence. If “Superman is nearby” occurs extensionally, it refers to its truth-value. But if “Superman is nearby” is the S-part of a complex phrase V+(that)S, where V is an attitude verb, “Superman is nearby” refers to its sense, the proposition that Superman is nearby.

On this account, “hope” refers not to a function that takes a truth-value and produces, as the meaning of the verb-phrase “hopes Superman is nearby,” a function that takes individuals (such as Lois) to truth-values. Rather, “hope” refers to a function which takes a proposition as input, for example the proposition that Superman is nearby, though it still produces, as the meaning of the verb-phrase “hopes Superman is nearby,” a function which maps some individuals, like Lois, to true, and others, like Lex Luthor, to false. However, since we have already agreed that “Superman is nearby” and “Clark is nearby” express different propositions (when occurring extensionally, as we would now add) because of the different senses of “Superman” and “Clark,” this means that the input to the reference of “hope” in (13a) is different from its input in (13b): two different propositions, rather than the single truth-value which is all that is available in the absence of the switch in reference of the content-sentences. Consequently, the verb-phrases “hope Superman is nearby” and “hope Clark is nearby” can refer to different functions; “hope Superman is nearby” can refer to a function which maps Lois to true, while “hope Clark is nearby” can refer to a function which maps Lois to false. This is Frege’s account of how (13a) and (13b) can differ in truth-value, and is the first example of what is nowadays called “switcher semantics” (Glüer and Pagin 2006, 2012; Pagin and Westerståhl 2010).
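A minimal sketch of the reference switch follows. It is a toy model, not a rendering of any published formalism: propositions are represented as pairs of strings, the name-senses are the descriptions floated above, and the stipulations that Kal-El is nearby and that Lois hopes only the “Superman” proposition are assumptions made for the example. All identifiers are hypothetical.

```python
from typing import Callable, Set, Tuple

Proposition = Tuple[str, str]  # (sense of the name, sense of the predicate)

NAME_REFERENT = {"Superman": "Kal-El", "Clark": "Kal-El"}
NAME_SENSE = {
    "Superman": "the red-caped superhero who flies",
    "Clark": "the mild-mannered Daily Planet reporter with a crush on Lois Lane",
}
NEARBY = {"Kal-El"}  # assumption: Kal-El is in fact nearby


def customary_reference(name: str) -> bool:
    """Truth-value of '<name> is nearby' when it occurs extensionally."""
    return NAME_REFERENT[name] in NEARBY


def customary_sense(name: str) -> Proposition:
    """Proposition expressed by '<name> is nearby'."""
    return (NAME_SENSE[name], "is nearby")


# Assumption: Lois hopes the "Superman" proposition and no other.
HOPED: Set[Tuple[str, Proposition]] = {("Lois", customary_sense("Superman"))}


def hopes(proposition: Proposition) -> Callable[[str], bool]:
    """Reference of 'hope': takes a proposition, returns the verb-phrase function."""
    def g(subject: str) -> bool:
        return (subject, proposition) in HOPED
    return g


def ascription(subject: str, name: str) -> bool:
    """Under the attitude verb, the content-sentence refers to its customary sense."""
    indirect_reference = customary_sense(name)
    return hopes(indirect_reference)(subject)
```

Here `customary_reference` returns the same value for both names, yet `ascription("Lois", "Superman")` and `ascription("Lois", "Clark")` differ, since the function for “hope” now receives two different inputs.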

The reference-switch thesis has immediate application to the question of what is wrong with (2). The Fregean answer is that (2) is a fallacy of equivocation. In (2a), “Superman” and “Clark Kent” have their customary referents, namely, Kal-El. But in (2b), “Superman” refers to its customary sense, the concept of being the red-caped superhero who flies; “Clark” also refers to its customary sense. As the example shows, identity of customary reference does not justify substituting one singular term for another in the content-sentence of an attitude attribution, since identity of customary reference falls far short of the identity of indirect reference (identity of sense) that would be needed for (2) to be valid.

Indeed, Frege’s theory predicts that it will be hard to find any nontrivial sound arguments in the style of (2), even if we change the major premise to be of the form “the sense of t1 = the sense of t2.” For then the major premise is true only if two different names have the same sense, and it is not clear under what circumstances that would happen. Perhaps it might be self-evident in the acquisition process that the names refer to the same person: the speaker introduces herself to x with “Hi! My name is Roberta, but people call me Bobbie.” But even if x correctly recalls this, Mates cases can be constructed: x may coherently think that everyone knows Roberta is Roberta but wonder if everyone knows Roberta is Bobbie. Perhaps we should say that for x, for a while, the two names have the same sense, but x envisages that others may use the names with different senses, and the semantics of “everyone knows that Roberta is Bobbie” allows, one way or another, for this possibility. (See also Schiffer’s discussion of the individuation of senses (1992:502–3). For a theory on which senses are never needed to deal with the likes of (2), see Millikan 2000, and for a pro-Fregean critique, Lawlor 2006.)

b. The Hierarchy Problem

There are problems of detail with Frege’s theory. One such is how to accommodate intersubjective variation in sense (see Zalta 2001). But perhaps the best known is the “infinite hierarchies” problem. As we have already seen with Mates sentences, one attitude ascription can be embedded within another. A simple case is:

(14)
a. Kal-El wonders if Lois has begun to notice that Clark is never around when Superman is.
b. Lois has begun to notice that Clark is never around when Superman is.
c. Clark is never around when Superman is.

According to Frege, “Lois has begun to notice that Clark is never around when Superman is” refers in (14a) to the sense it expresses in (14b), since it is within the scope of “wonders” in (14a). And “Clark is never around when Superman is” refers in (14b) to its customary sense, the sense it expresses in (14c) (curiously, the names in (14c) also seem to resist substitution, despite the lack of attitude verbs; we will return to this in our discussion of “simple sentences”). These sentence-senses are obtained systematically from the senses of their constituent words. So in (14b), “Clark” refers to the way of thinking of Kal-El it expresses in (14c), which we label m1. But whenever a word refers, it does so by expressing a way of thinking of that reference. So “Clark” in (14b), referring as it does to m1, must express a way of thinking of m1, which we label m2. Plausibly, m2 cannot be m1 over again, for (i) m2 = m1 would require the same way of thinking to be of both a person, Clark, and of a way of thinking of that person, m1; and, (ii), m2 = m1 means that m1 is a way of thinking of itself, an idea not breathtaking in its intelligibility (see further Peacocke 2009:162–3; but see also Dummett 1973:264–9 for an attempt to get by with just m1). So these considerations motivate the idea that in (14b), “Clark” expresses a way of thinking m2 which is of m1 and not identical to m1.

Now, (14b) occurs in (14a) within the scope of the hyperintensional “wonders,” so its reference in (14a) and the referents of its constituent words in (14a) must switch; they switch from the referents they have in (14b) to the senses they express in (14b). This means that in (14a), “Clark” refers to m2. But then, “Clark” in (14a) must express a sense which is a way of thinking of m2, since this is the only way “Clark” could refer to m2. Call this sense m3. As before, it is implausible that m3 is the same as m2, since, first, it would have to be a way of thinking of itself, and second, it would have to be both a way of thinking of m2, but also, since ex hypothesi it is m2, would have to be a way of thinking of m1. m3, then, appears to be something new.

And so we are off. We can make (14a) the content-sentence of a new attitude ascription, say

(15)
Lex suspects that Kal-El wonders if Lois has begun to notice that Clark is never around when Superman is.

Now the sense (14a) expresses becomes the reference of (14a) in its appearance as the content-sentence of (15), and the words of (14a) will express new senses in (15), ways of thinking of the senses they express in (14a); for example, in (15), “Clark” will express m4, a way of thinking of m3, so that “Clark” in (15) can refer to m3. Since there is no principled restriction on how deeply attitude verbs may be embedded within other attitude verbs, we have, apparently, an unending sequence of senses. In particular, “Clark” can express infinitely many ways of thinking, none of which are intelligible beyond the first or second. Some Frege scholars have developed formal models of sense and reference which embody such hierarchies; see, for example, Church (1951) and Anderson (1980). However, others have tried, in effect, to stop at m2; see especially Parsons (1981, 2009).
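The way the hierarchy is generated can be pictured with a short recursive construction. The sketch is schematic and rests on assumptions of its own: a sense is modeled simply as a labeled object that records what it is a way of thinking of, the labels past m2 just continue the numbering, and `Sense` and `clark_at_depth` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Sense:
    label: str
    of: Union[str, "Sense"]  # what this sense is a way of thinking of


def clark_at_depth(depth: int) -> Tuple[Union[str, Sense], Sense]:
    """Return (reference, sense expressed) for "Clark" under `depth` attitude verbs.

    depth 0 corresponds to (14c), 1 to (14b), 2 to (14a), and 3 to (15).
    """
    reference: Union[str, Sense] = "Kal-El"
    sense = Sense("m1", reference)
    for level in range(depth):
        reference = sense  # the old sense becomes the new reference
        sense = Sense(f"m{level + 2}", reference)  # a new way of thinking of it
    return reference, sense


reference_in_15, sense_in_15 = clark_at_depth(3)
```

Each additional embedding turns the previously expressed sense into the reference and calls for a fresh sense above it, so nothing in the construction ever halts the sequence.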

c. The Semantic Innocence Objection

Problems of detail aside, there are two main objections to Frege’s account which have emerged in the last few decades, the semantic innocence objection and the no-such-thing-as-senses objection. We take the former first.

The semantic innocence objection is so-called because of its famous statement by Davidson (1969:172):

If we could recover our pre-Fregean semantic innocence… it would seem to us plainly incredible that…words [in the content-sentences of attitude attributions] mean anything different, or refer to anything else, than is their wont when they come in other environments.

This is, admittedly, simply an appeal to intuition, but it is a powerful one (see also Loar 1972:43). It is indeed very difficult to detect a switch in the reference of “Superman” if Lois remarks “Superman is nearby, if I’m in luck” versus if she remarks “I hope that Superman is nearby.” The reference-switch thesis also causes problems for the treatment of anaphoric pronouns. In “Galileo thought that the Earth moves, and he knew what he was talking about, so it moves,” it is undeniable that the “it” refers to the Earth. But then the pronoun does not directly inherit its reference from its antecedent (see further Segal 1989). No doubt there are epicycles which get round this, but it is questionable whether that road is worth going down, given the lack of intuitive support at its starting point.

d. Do Name-Senses Exist Anyway?

An even more damaging objection to Frege’s account of substitution-failure for names is that the entities which play the crucial role, senses or ways of thinking of individuals, are chimerical. That Fregean name-senses do not exist is the core argument of Kripke (1972). Briefly, suppose that “Aristotle” does express a reference-determining sense, captured by, say, the singular definite description “the pupil of Plato who tutored Alexander and wrote the Nicomachean Ethics.” One possibility is that this description articulates the meaning of the name in much the way that a dictionary might articulate the meaning of “philosopher.” Then it should be both necessary and a priori that Aristotle tutored Alexander. But it is neither. Aristotle could have been killed in an Athenian traffic accident in his youth, so it is not necessary that he tutored Alexander; and that he did so is clearly an empirical claim, which only historical evidence can confirm or disconfirm. Similarly, not even “if Aristotle and Alexander existed, the former tutored the latter” is necessary or a priori.

A somewhat weaker thesis is that the reference of “Aristotle” is fixed by the description, without being synonymous with it. But even merely this would predict, of some perfectly intelligible statements, that they are semantically problematic. For example (based on Kripke’s “Gödel case,” 1972, 1980:83–5), suppose that someone claims on a fake-news website to have found documents showing that Aristotle was not a pupil of Plato, did not tutor Alexander and did not write the Nicomachean Ethics. The first two items Aristotle deliberately falsified on his CV in order to attract students, and though he published the Nicomachean Ethics under his own name, that was after stealing the manuscript from the true author (not a pupil of Plato), whom he murdered to ensure his silence. And as time passed, the false claims became firmly lodged in popular lore about Aristotle.

If it went viral, this story about Aristotle would outrage historians of philosophy. But the very fact that they would be outraged shows that they understand the story well enough. Yet, if the reference of the name is fixed by the description, the story is self-refuting (if it is true, then it is not true): Aristotle did not lie about tutoring Alexander, for according to the story, “Aristotle” is an empty name, so “Aristotle lied” should be either false or neither true nor false. But no historian would contest the story on the grounds that it is self-refuting: the debate would be over the existence or trustworthiness of the documents that the story is based on. The ability to debate the truth of the story, with both sides treating “Aristotle lied about Plato” as at least debatable, is hard to explain if the reference of “Aristotle” is fixed by the proposed description. And if some other description of the same “famous deeds” sort is substituted, a similar example would surely be constructible.

If the weaker, reference-fixing thesis does not support attribution of senses to names, perhaps we should go back to the stronger, meaning-giving thesis, and try a different kind of description. Kripke considers modifications like (whoever it is who is) “the person commonly thought to have been a pupil of Plato who tutored Alexander and wrote the Nicomachean Ethics.” He argues that this is vulnerable to counterexamples involving subjects who have not kept up with what is commonly thought about whom (1980:88), and he raises a circularity objection (loc. cit.).

The new description identifies Aristotle as the person commonly thought to be thus-and-so. So there is a certain range of thoughts s1,…,sn had by members of the linguistic community, thoughts of various people to the effect that Aristotle tutored Alexander, Aristotle was taught by Plato, and so on, and these determine the reference of “Aristotle.” But ex hypothesi, “Aristotle” as it occurs in these thoughts means “the person commonly thought to be…,” referring us back once again to s1,…,sn. There is an unending loop here, and we never escape from the thoughts s1,…,sn to a specific object as the reference of “Aristotle.”

Kripke also points out that we manage to refer easily enough even when there are no identifying descriptions we could cite. He gives the example of “Richard Feynman,” a name many people use without having an associated definite description (1980:81—this was before Feynman’s incisive testimony at the Challenger disaster inquiry). An associated indefinite description might be “a famous physicist at Caltech who won the Nobel Prize.” But “a” cannot be strengthened to “the,” since Murray Gell-Mann is also a famous physicist at Caltech who won the Nobel Prize. And if we insert “not identical to Gell-Mann” into the description, we make it impossible to refer to Feynman without having a way of thinking of Gell-Mann (not to get into the looming indeterminacy problem).

e. Alternative Accounts of the Sense of a Name

If Kripke’s arguments show that Fregean senses of names do not exist, then the Fregean solution to the problem of opacity collapses, rather like a well-worked-out theory of human behavior in which demonic possession plays a large and crucial role. However, it would be fair to say that Kripke’s counterexamples tell mainly against “famous deeds” descriptivism and some modifications of it involving qualifiers like “commonly thought.” It is reasonable to focus on famous-deeds descriptions, since Frege says that everyone who uses the name expresses a reference-determining sense with it, and so to guarantee that each individual is in possession of such a sense, one naturally looks to information that is easily come by. But perhaps there are other options for the content of name-senses besides famous deeds.

One alternative, due to Chalmers and developed in the two-dimensional framework of Stalnaker (1987), is two-dimensional sense. A two-dimensional sense is an ordered pair consisting in an epistemic sense and a subjunctive sense. For a name, the epistemic sense is a function from “scenarios” to individuals, and the subjunctive sense is a function from possible worlds to individuals (Chalmers 2011:596–9). A scenario is something like a coherent total description of how things might have turned out to be, and the epistemic sense of a name may be a nonrigid function on such items: in one scenario, a name may refer to x, while in another it may refer to a distinct y. But subjunctive senses are rigid: they denote the same object in any two worlds. The idea is then that epistemic operators are sensitive to the epistemic sense, and modal operators to the subjunctive sense, which, since it is a rigid function, may be identified with the object to which it stably refers (2011:597, T4, T5).
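The structure just described can be made concrete in a small model. The following is only an illustrative sketch, not Chalmers's own formalism: individuals, scenarios, and worlds are represented as plain labels, and the names `TwoDimensionalSense` and `is_rigid` are hypothetical, introduced for this example. It shows an epistemic sense that varies across scenarios alongside a subjunctive sense that is constant across worlds.

```python
from dataclasses import dataclass
from typing import Callable

Individual = str   # individuals modeled as plain labels (an assumption)
Scenario = str     # epistemic scenarios, likewise just labels
World = str        # possible worlds, likewise just labels


@dataclass(frozen=True)
class TwoDimensionalSense:
    """An ordered pair of an epistemic sense and a subjunctive sense."""
    epistemic: Callable[[Scenario], Individual]    # may be nonrigid
    subjunctive: Callable[[World], Individual]     # required to be rigid


def is_rigid(f, domain):
    """True if f returns the same individual at every point of domain."""
    return len({f(point) for point in domain}) <= 1


# Hypothetical toy model: in scenario "s2" a different individual turns out
# to be picked out by the epistemic sense of the name.
scenarios = ["s1", "s2"]
worlds = ["w1", "w2", "w3"]

aristotle = TwoDimensionalSense(
    epistemic=lambda s: {"s1": "Aristotle", "s2": "someone else"}[s],
    subjunctive=lambda w: "Aristotle",
)

epistemic_is_rigid = is_rigid(aristotle.epistemic, scenarios)
subjunctive_is_rigid = is_rigid(aristotle.subjunctive, worlds)
```

In this toy model the subjunctive sense is a constant function, which is why it can be identified with the object it stably returns, while the epistemic sense is not.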

If epistemic senses are just famous-deeds descriptions or their like, Kripke’s objections arise over again. And it would certainly be unfortunate if epistemic and subjunctive senses came apart over actual reference, since then statements like “it’s a posteriori that Aristotle was a philosopher” and “it’s contingent that Aristotle was a philosopher” would be about different people. However, Chalmers has a proposal on which this difficulty and certain others will not arise. Asking what might replace a famous-deeds descriptivist account of how names refer, Kripke suggested a “historical chain” account (1972; 1980:91–4):

[S]omeone, let’s say a baby, is born; his parents call him by a certain name. They talk about him to their friends. Other people meet him. Through various sorts of talk the name is spread from link to link as if by a chain… it’s in virtue of our connection with other speakers in the community, going back to the referent himself, that we refer to a certain man.

The same idea was advanced by Geach (1969:288–9):

[F]or the use of…a proper name there must in the first instance be someone acquainted with the object named…But …the…name…can be handed on from one generation to another… Plato knew Socrates, and Aristotle knew Plato, and Theophrastus knew Aristotle, and so on in apostolic succession down to our own times. That is why we can… use “Socrates” as a name the way we do.

One thing required for x to refer to Socrates with “Socrates” nowadays, then, is that x belong to a linguistic community in which there is an apostolic succession from Socrates to x along which the name “Socrates” is passed. (Following Kripke, x also has to intend to defer in x’s use of the name to those from whom x acquired it—if x decides that “Socrates” would be a fine name for x’s pet turtle, that does not count.)

Kripke mentions that Nozick once remarked to him that if any theory of reference is correct, some descriptivist theory is immune to counterexamples in the style of Naming and Necessity. This would be a descriptivist theory on which the descriptions are theory-laden: they incorporate the reference-determining conditions the correct theory formulates (Kripke 1980:88, n.38). Chalmers exploits this option: taking the historical chain theory as a plausible account of reference-determination, he suggests that the epistemic sense of a name NN might just be “the object NN refers to in the mouths of those from whom I acquired it” or its like (Chalmers 2002:641). This will be a nonrigid function, since in some scenarios, the apostolic succession for “Socrates” will lead to contemporary users but start from an individual x who is not Socrates.

Since the description suggested above involves the term “refer,” there is an obvious circularity worry if the sense is to be reference-determining. Chalmers argues (2002:641–3) that there is no reason to worry, since the evaluation of one person’s epistemic sense takes us back to other people, and their epistemic senses will carry us back to even earlier people, until we arrive at the “initial baptism” introducing the name. The question would then be whether the concept of reference is ineliminably invoked at this point, as in “we hereby name this child NN,” and how significant a problem that would be.

A second question is whether epistemic senses are otiose as far as determining reference is concerned. Is the reason why I can use “Socrates” to refer to Socrates not simply that I belong to a community in which there is a chain of uses of “Socrates” linking me to Socrates in the way the historical chain theory describes, and I have added the name to my repertoire with the intention to use it in a way that preserves the reference of those from whom I acquired it? Perhaps adding the name to my repertoire with such a deferential intention is the very same thing as attaching a theory-laden sense to it. But if not, the postulation of an epistemic sense seems redundant: the reference of the name in my mouth is already determined by my social situation, and if I express a certain epistemic sense with it, that is just a private epiphenomenon.

A second alternative to famous-deeds senses is what we might call “cognitive descriptivism,” since it is based on a (somewhat metaphorical) hypothesis about cognitive architecture. The idea is that we organize our information about what we take to be separate objects that we have encountered into separate mental files, or dossiers. This seems to have first been proposed by Grice (1969:141–4), and was used in an account of the senses of names in Forbes (1990). The neo-Fregean idea is that the sense of a name NN for x is “the subject of this dossier,” where the mental demonstrative “this dossier” refers to the dossier labeled NN by x in x’s mental filing system.

Clearly, questions about circularity and redundancy arise much as they do for two-dimensional sense (see Fine 2007:67–8). If what makes x the subject of the dossier labeled NN is that x is the referent of the name NN, then we have circularity. But if being the subject of the dossier labeled NN consists in—to use the causal theory of Evans (1973)—being the dominant causal source of the information in the dossier, why not cut out the detour through dossiers and just say that the reference of a name NN is the dominant causal source of information that would be expressed in statements of the form “NN is…”? Such issues are pursued in Recanati (2012) and Saka (2018), and are far from settled in the literature. But it is clear from these examples that famous-deeds descriptivism is not in sole possession of the field as an elaboration of Frege’s notion of the sense of a name.

However, whatever viable theory of sense may ultimately be produced, the semantic innocence objection will have to be dealt with. Thomason (1980) is unmoved by it, but we shall next consider accounts on which senses may be invoked by attitude ascriptions in a way that explains failure of =E, yet allows the words of the content-sentence to keep their customary references, thereby meeting Davidson’s complaint.

4. Hidden-Indexical Semantics

The reference-switch hypothesis is one version of the more general notion that the words used in the content-sentence of an attitude ascription have a special role that they do not play in other contexts. If the special role does not displace their normal role, we arrive at Loar’s idea of a dual contribution (1972:52–3). On the one hand, as Davidson insists, the words of the content-sentence play their normal role. But there is another semantic mechanism at work in which they are also complicit. There is a wide range of such dual contribution accounts in the modern discussion of opacity, perhaps starting with Loar (1972). Field (1978) has the content-sentence invoking a sentence of the “language of thought.” Bealer (1993) proposes an ambiguity theory, on which the content-sentence of an ascription introduces both an entity composed of the referents of the words, thereby explaining the innocence intuition, and an entity like a Fregean proposition, thereby accounting for the intuition of substitution-resistance in the likes of (2). And Larson and Ludlow (1993) develop a semantics on which a propositional attitude is an attitude to an “interpreted logical form” (ILF) which is a tree structure in which a node is occupied by both the reference of the expression at that node and the expression itself. Consequently, “Superman can fly” and “Clark can fly” are different ILFs simply in virtue of “Superman” and “Clark” being different names.
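The point about ILFs can be illustrated with a toy data structure. This is a rough sketch under simplifying assumptions, not Larson and Ludlow's actual semantics: the class `ILFNode`, the helper functions, and the referent labels are all hypothetical. Each node pairs an expression with its referent, so two trees can agree on every referent and still be distinct objects.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ILFNode:
    expression: str                  # the word or phrase at this node
    referent: Optional[str]          # its customary referent, as a label
    children: Tuple["ILFNode", ...] = ()


def referents(node):
    """Strip the expressions, keeping only the structure of referents."""
    return (node.referent, tuple(referents(c) for c in node.children))


def sentence(subject_name):
    # Both names are assigned the same referent label, "Kal-El".
    return ILFNode(
        expression=f"{subject_name} can fly",
        referent=None,  # sentence-level referent left unspecified here
        children=(
            ILFNode(subject_name, "Kal-El"),
            ILFNode("can fly", "the property of being able to fly"),
        ),
    )


superman_ilf = sentence("Superman")
clark_ilf = sentence("Clark")

same_referents = referents(superman_ilf) == referents(clark_ilf)
same_ilf = superman_ilf == clark_ilf
```

Here `same_referents` holds while `same_ilf` does not, which mirrors the claim that the two ILFs differ simply because the names differ.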

a. Two Kinds of Hidden-Indexical Theories

Some versions of the dual contribution approach are known as “hidden-indexical” accounts (Schiffer 1979), because of the role context-dependence plays in determining the second contribution of the content-sentence, or because there actually is an indexical expression postulated to occur covertly in the ascription. For example, in Crimmins and Perry (1989) and Crimmins (1992), belief-ascriptions are said to be made true by items supplied by the context in which the ascription is made, items called “unarticulated constituents” because there is no expression in the ascription responsible for their intrusion into the truth-condition. Different but coreferential names may be associated with different normal notions of the same object, and an inference like (2) fails because the substitution changes which normal notion of Kal-El is, in their technical sense, “involved” (there is no reference-switch on the part of the names). Similarly, in Richard (1990), the content-sentence of a belief-ascription invokes a “Russellian annotated matrix” (RAM), which, like an ILF, is an item that contains both Fregean referents and the expressions referring to them, and the truth-condition requires that the RAM in the ascription correlate with a RAM believed by the subject of the ascription. What correlates with what is context-dependent, and (2) fails because substitution need not preserve correlation, even though it preserves Fregean reference (Richard 1990:133–41). Finally, in Forbes (1990, 1996) and Recanati (2000:137–63) there is a hidden “so” in belief-ascriptions, as if “believes” were “so-believes,” which blocks substitution much as it does in Quine’s “Giorgione” case, (4), since the “so” refers to the content-sentence of the ascription.

One respect in which the above theories differ is over what kind of thing is believed. In Schiffer’s general scheme for hidden-indexical theories (1992:503–4), what is believed is a proposition of a non-Fregean kind, but the ascription includes as part of its literal meaning that this proposition is believed under a way w of thinking of it. Here w is something like a Fregean proposition in certain respects, and is specified by the very words used in the content-sentence of the ascription. Substitution then has the side-effect of changing the relevant way of thinking, say from the “Superman can fly”-way to the “Clark can fly”-way, and this opens the door to change of truth-value.

The kind of proposition of which w is a way of thinking is known as a “Russellian” proposition, after a famous exchange between Russell and Frege (Frege and Russell 1904). Frege had claimed that Mont Blanc “with its snowfields” is not itself a component of the thought that Mont Blanc is more than 4,000 meters high, to which Russell replied that “in spite of all its snowfields Mont Blanc itself is a component part of what is actually asserted…a certain complex.” Accounts of Russellian propositions have been given in some detail (for example, Cresswell 1985, Crimmins 1992:117–24; see Jespersen 2003 for critical discussion), and in Schiffer’s scheme, attitude ascriptions invoke quasi-Fregean ways of thinking of such complexes, while the attitude itself is to a Russellian proposition.

In the approach of Forbes (1990, 1996), however, it is a Fregean proposition to which an attitude is held, but one that is specified as the way of thinking of the referent of the content-sentence, where this way is determined by that very sentence. The referent is not a truth-value, as Frege would have had it, but rather an abstract state of affairs, which is a structured entity not unlike a Russellian proposition, though one that fits better into a Fregean scheme. So (2b) becomes

(16)
That Superman can fly is so-believed by Lois

or, more long-windedly,

(17)
Lois believes her so-labeled way of thinking of the state of affairs that Superman can fly

in which “so” refers to “Superman can fly,” sealing it off from substitution in the same way as it does for “Giorgione” in (4). (17) requires for its truth that the ascriber’s content-sentence be a “linguistic counterpart” of some sentence of Lois’s that she would use to express the belief that (17) is attempting to ascribe to her (compare Richard’s notion of correlation), a belief which is a way of thinking of the state of affairs that Superman can fly (which is equally the state of affairs that Clark can fly and equally the state of affairs that Kal-El can fly).

One problem for (17) is that it requires reference-determining senses, whereas Schiffer-style approaches need not. Additionally, (17) departs from (16) in a rather substantial, if not frequently noticed, way: the “that”-clause disappears, and the clausal form of “believes” is replaced by the transitive one (the direct object in (17) is everything following “believes”). But though there seems to be an equivalence between believing that… and believing the proposition (thought, so-labeled way of thinking) that…, it does not generalize to other attitude verbs. For example, suspecting that Lex Luthor is involved is not the same thing as suspecting the proposition that Lex Luthor is involved (is anyone so paranoid as to suspect propositions?—Moltmann (2003:82) credits Arthur Prior with first noticing this issue). The same thing occurs, though for different reasons in different cases, with such verbs as “announce,” “anticipate,” “ask,” “boast,” “calculate,” “caution,” “complain,” “conclude,” “crow,” “decide,” “detect,” “discover,” “dream,” “estimate,” “forget,” “guess,” “hope,” “insinuate,” “insist,” “interrogate” (literary theory), “judge,” “know,” “notice,” “observe,” “plan,” “prefer,” “pretend,” “rejoice,” “require,” “see,” “suggest,” “surmise,” “suspect,” “understand,” and various cognates of these. The verbs for which the equivalence holds include inference verbs like “deduce” and “infer,” plus a few other examples like “doubt,” “establish,” and “verify.” Unfortunately, it would take us far afield were we to address the issue of how to modify (17) for the verbs for which the equivalence fails (see Forbes 2018 for one account).

As the previous paragraph indicates, some hyperintensional clausal verbs that can be used to ascribe propositional attitudes have hyperintensional transitive forms that can be used to ascribe what we might call objectual attitudes. These seem to generate failures of =E much as their clausal counterparts do. For example, “Lex fears Superman” is true, but “Lex fears Clark” does not seem any more plausible than “Lex fears that Clark will crush him.” The apparatus in (17) can be employed to express a hidden-indexical theory for the transitive verb case: the substitution-resistant reading of “Lex fears Superman” is “Lex fears Superman as such,” or “Lex fears Superman so-personified,” and the references of the “such” and “so” will change if “Clark” replaces “Superman,” producing the false “Lex fears Clark {as such/so-personified}.” A fuller version of the substitution-resisting semantics for “Lex fears Superman” might be

(18)
Lex fears Superman under the way of thinking of him that is so-labeled.

Here “under” forms an adverbial phrase modifying the whole verb-phrase in (18) headed by “fears” (there is some dispute about how such an “under” is to be accommodated; see Schiffer 1996, Ludlow 1996).

Hidden-indexical theories all preserve semantic innocence in roughly the same way: there is some entity, whether Russellian proposition or abstract state of affairs, determined by the customary referents of the words of the content-sentence, so the result is compatible with a Davidsonian decrying of any theory which claims that words in attitude ascriptions abandon their customary referents for something else. The “something else” is involved in a different way, a strategy which (17) and (18) illustrate.

Hidden-indexical semantics also offers an alternative formal account of the de re/de dicto distinction. Standardly, the difference is brought out in terms of scope distinctions, as we did in (10). But another possibility is that de re readings are those in which a hidden-indexical refers only to a part of the content-sentence: if Lois believes that her coworker Mary has gone to St. Petersburg, we may point at Mary and say “Lois believes that that woman is in St. Petersburg,” meaning that she believes some way of thinking of the state of affairs, partially labeled “is in St. Petersburg.” This would explain why the awkward locutions in (10) are rarely encountered in ordinary speech and writing.

b. Kripke’s Puzzle

One application of hidden-indexical semantics is to Kripke’s “puzzle about belief” (1979). Kripke doubts that there is a specific problem of interchange of coreferential names in attitude ascriptions, to be resolved by a semantics on which such substitution is fallacious. Rather, he thinks substitutivity problems are a mere symptom of broader anomalies in psychological discourse (“It would be wrong to blame…substitutivity. The reason does not lie in any specific fallacy [for example in (2)] but rather in the nature of the realm being entered,” 1979:157). So he gives examples meant to bring out anomalies even in the absence of substitution.

His main example is that of a subject, Peter, who encounters the same individual under the same name in different contexts and does not realize it was the same person all the time. Suppose Peter goes to a recital by a pianist named Paderewski, and, picking up the name from the recital program, comes to believe on the basis of the performance that Paderewski has musical talent. Later, at a railway station, he observes an individual surrounded by reporters, and someone tells him “That’s Paderewski, the Polish Prime Minister.” Far from connecting the man he sees with the man he heard play, Peter, who believes that no politician has musical talent, remarks out loud, “Ah, a person of no musical talent, then.” But, of course, Ignacy Jan Paderewski, the Prime Minister of Poland after the First World War, was also a celebrated composer and concert pianist.

Kripke wants us to try to answer the question, “Does Peter, or does he not, believe that Paderewski has musical talent?”, and in the course of our attempting to answer it, to realize that no answer can be given, because of “the nature of the realm being entered.” However, from the Fregean perspective, the example is less troubling, as Kripke recognizes (see also Taschek 1988). Peter has two lexical entries for “Paderewski,” in the same way that the present writer has three for “Socrates”—one for the Ancient Greek philosopher, another for the late Brazilian footballer, and a third for the former Portuguese Prime Minister (the latter two individuals had different first names, but I do not know what they are, and I do not know if the first individual had any other name; on the individuation of names, see Kaplan 1990). Of course, the difference between Peter and myself is that the names in Peter’s two lexical entries are coreferential, while the names in my three are, pairwise, not, unless the footballer, on retiring from the game, moved to Portugal and went into politics.

However, an ascriber A may only have one name for Paderewski (one mental file so-labeled), which puts A at a certain expressive disadvantage relative to Peter, if the ability to make an accurate report about Peter’s beliefs requires A to use names which match Peter’s. A would then need two names for Paderewski. But there is a very natural way around this (which Kripke uses himself, in n.37): A can simply say that Peter believes that Paderewski the pianist has musical talent, while Paderewski the statesman does not (Forbes 1990:561). From the perspective of a semantics like that of (17), the appositive uses of “the pianist” and “the statesman” determine different ways of thinking of the single state of affairs that Paderewski had musical talent. And it is only the way of thinking labeled with Peter’s linguistic counterpart of A’s “Paderewski the pianist has musical talent” that he believes: the appositives help us identify which of Peter’s ways of thinking of Paderewski we wish to invoke in our ascriptions. It remains to explain why the major premise that Paderewski the pianist is Paderewski the statesman does not license the inference to “Peter believes that Paderewski the statesman has musical talent.” This would partly recapitulate our discussion of (2), though of course the appositives may bring their own complications.

It is also conceivable that ascribers in the know about Peter’s situation, addressing an audience also in the know, can rely on context to fix which belief is ascribed to Peter using “Paderewski has musical talent”; for instance, if the discussion concerns Peter’s evaluations of various pianists, the possessive description “Peter’s so-labeled way of thinking” is proper, rather than improper, since the other way of thinking, labeled with Peter’s linguistic counterpart of “Paderewski the statesman has musical talent,” will not be in the domain of the context, even if the discussion takes place after the railway-station encounter.

One can therefore resist Kripke’s question whether Peter does or does not believe that Paderewski had musical talent, just as I would resist the question “Was Socrates, or was he not, a chain-smoker?” The footballer was, but (I suppose) the philosopher was not, so absent contextual clues I would require disambiguation of the question: “Are you asking whether Socrates the footballer was a chain-smoker, or Socrates the philosopher?” In the Paderewski case, there is no referential ambiguity, but there is still an ambiguity, or indeterminacy, over which way of thinking of the state of affairs in question is being invoked: “Are you asking whether Peter believes Paderewski the pianist has musical talent, or Paderewski the politician?” would be a perfectly proper response. The explanation why it is perfectly proper is clear enough on hidden-indexical theories, but may not be so on others (see also Soames 2002, Chs. 2, 3).

Obviously, this account only works if there is a viable notion of the sense of a name. For those skeptical about the prospects of such a thing, Fine (2007) offers an alternative treatment of the puzzle. Fine begins with an explanation of the difference between “Superman is Superman” and “Superman is Clark”: in “Superman is Superman,” the two names are coordinated, but not in “Superman is Clark.” One manifestation of this is that someone who wonders whether Superman is Superman thereby demonstrates a failure to grasp what is said, while Lex can wonder whether Superman is Clark without demonstrating any failure of understanding. Since Fine takes the coordinated/uncoordinated distinction to be of semantic import, his view could be regarded as neo-Fregean, since he thinks “Superman is Superman” and “Superman is Clark” have different semantics, though his view of how the difference arises is quite unlike Frege’s (see Pickel and Rabern 2017 on some questions that arise for Fine’s account here).

Fine then argues that the case of Peter presents us with a puzzle whose solution is to be formulated in terms of this notion of coordination (2007:100–105). The puzzle is that our normal practices of belief-reporting dictate that we report Peter as believing that Paderewski has musical talent, and that we also report him as believing that Paderewski has no musical talent. At the same time, according to Fine, we do not want to make a “composite” report, that Peter believes that Paderewski has musical talent and believes that Paderewski has no musical talent, since this represents Peter as rather unreflective, which is unjustified (more reflection will not help). Yet the composite report is a simple “and”-Introduction inference from the acceptable reports. How can it sensibly be resisted?

Fine’s suggestion (2007:102–3) is that the composite report is unacceptable precisely because the reporter (who is in the know about Peter’s situation) uses “Paderewski” in a coordinated way across the content-sentences of the composite report, while Peter does not use coordinated “Paderewski’s” in giving voice to his two beliefs. But the individual reports are acceptable, taken in isolation: there is nothing to be coordinated in an individual report, so we can simply take at face value Peter’s assertion of “Paderewski has musical talent,” even asserted after he has both entries in his lexicon, and ascribe such a belief to him. For the Fregean, by contrast, if there is nothing in the context to point toward one of “Paderewski the pianist” and “Paderewski the statesman” rather than the other, it will be indeterminate what belief is being ascribed. And for the Fregean, the composite report, if it is the conjunction of two determinate ascriptions, is acceptable. Perhaps it makes Peter sound unreflective; but so does “The present writer believes Socrates was a chain-smoker and believes Socrates was not (ever) a chain-smoker,” though as I write it, it is true.

5. Russellianism

At the beginning of section 2, we noted that there is a possible response to the appearance of substitution-failure in (2) according to which the reasoning is not flawed at all: if Superman is Clark and Lois believes Superman can fly, she simply does believe that Clark can fly, even though she would not put it that way. The main motivation for this account is the view of propositions advanced by Russell in his letter to Frege quoted above, according to which Mont Blanc itself, not a way of thinking of it, is the sole constituent the name contributes to the proposition about its height. The locus classicus of this theory is Salmon (1986); other prominent contributions include Soames (1987), Saul (1997), and Braun (1998).

a. Salmon’s Theory

According to Salmon, belief-ascriptions invoke both Russellian propositions and ways of taking or of grasping those propositions. The apparently two-place attitude relation of belief unfolds into a three-place relation, with a position for a variable over ways of grasping. So for A believes p, Salmon offers (1986:111)

(19)
for some way of grasping propositions w, A grasps p by means of w and bel(A,p,w).

The correctness of the substitution inference (2) is immediate from this. If (2b) is true, Lois has a way of grasping the proposition that Superman can fly under which she believes this proposition. Ipso facto, she has a way of grasping the proposition that Clark can fly under which she believes this proposition, for it is the same proposition. Thus, (2c) is also true. Ways of grasping may be like Frege’s ways of thinking in some respects, but they are not what is believed, and they are not meant to determine reference.
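The reasoning in this paragraph can be traced in a small model. The sketch below is an illustration under stated assumptions rather than Salmon's own formal apparatus: Russellian propositions are represented as tuples of referents, the table `BEL` is a hypothetical record of the three-place relation, the guise label is invented for the example, and every recorded triple is assumed to involve a way by which the agent grasps the proposition.

```python
# Russellian propositions modeled as tuples of the referents themselves.
# Both names contribute the same individual, so the propositions coincide.
REFERENT = {"Superman": "Kal-El", "Clark": "Kal-El"}


def proposition(name, predicate):
    return (REFERENT[name], predicate)


# Hypothetical record of the ternary relation bel(A, p, w): Lois stands in
# it only via a way of grasping associated with the name "Superman".
BEL = {("Lois", proposition("Superman", "can fly"), "via the 'Superman' guise")}


def believes(agent, p):
    """Schema (19): for some way w, agent grasps p by w and bel(agent, p, w).

    Simplifying assumption: each triple in BEL already presupposes grasping.
    """
    return any(a == agent and q == p for (a, q, _w) in BEL)


premise = believes("Lois", proposition("Superman", "can fly"))    # (2b)
conclusion = believes("Lois", proposition("Clark", "can fly"))    # (2c)
```

Because the way of grasping is existentially quantified away and the two content-sentences determine one and the same tuple, `conclusion` is true whenever `premise` is; the way of grasping never enters into what is believed.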

Also note that Fine’s concern to avoid the composite ascription “Peter believes Paderewski has musical talent and believes Paderewski has no musical talent” is allayed, since the composite ascription is harmless on Salmon’s theory. For it involves two existential quantifiers over ways of grasping: there is some way of grasping the proposition that Paderewski has musical talent under which he believes it (more accurately, bels it), and some way of grasping the proposition that Paderewski has no musical talent, under which he believes it. The second way of grasping is no mere negation of the first, so there is nothing that imputes an intellectual deficiency to Peter (Salmon 1986:130–1).
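
The existential structure of (19) can be illustrated with a small sketch. The Python below is a toy model under stated assumptions, not Salmon’s own formalism: Russellian propositions are modeled as pairs of an individual and a property, ways of grasping are opaque string labels, membership in the three-place relation is taken to include grasping the proposition in the relevant way, and “has no musical talent” is treated as a separate predicate rather than a negation. The names `BEL`, `believes`, `prop`, `KAL_EL`, and the guise labels are all hypothetical.

```python
# Toy model of a Salmon-style belief ascription (illustrative assumptions only).
# A Russellian proposition is a pair (individual, property): a name contributes
# only its referent, so "Superman" and "Clark" contribute the same object.

KAL_EL = "kal-el"  # hypothetical label for the one individual with both names
REFERENT = {"Superman": KAL_EL, "Clark": KAL_EL, "Paderewski": "paderewski"}


def prop(name, predicate):
    """Russellian proposition expressed by a simple 'name + predicate' sentence."""
    return (REFERENT[name], predicate)


# BEL is the three-place relation: (agent, proposition, way of grasping).
BEL = {
    ("Lois", prop("Superman", "can fly"), "superhero-guise"),
    ("Peter", prop("Paderewski", "has musical talent"), "pianist-guise"),
    ("Peter", prop("Paderewski", "has no musical talent"), "statesman-guise"),
}


def believes(agent, p):
    """(19): for some way of grasping w, BEL(agent, p, w)."""
    return any(a == agent and q == p for (a, q, _w) in BEL)
```

In this model `believes("Lois", prop("Clark", "can fly"))` holds because `prop("Clark", "can fly")` and `prop("Superman", "can fly")` are the same pair, which mirrors the validity of (2). Both conjuncts of the composite ascription about Peter also hold, each witnessed by a different way of grasping.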

The main question this account raises is why it seems so clear that there is a way of understanding (2) on which it is invalid. Salmon answers this question by distinguishing between semantically encoded and pragmatically imparted information (Salmon 1986:78). As far as what is semantically encoded is concerned, (2b) and (2c) are the same. But they differ over what they pragmatically convey, and those who think (2b) and (2c) can have opposite truth-values are mistakenly projecting the pragmatic difference onto the semantics. For example, it may be that (2c) pragmatically conveys that Lois believes that “Clark can fly” expresses a truth and that she would assent to it if asked. Loading this into the semantics would be like the mistake made by students in beginning logic classes when they reject “all Fs are G” on being informed that some Fs are G. The defeasible “not all” conveyed pragmatically by “some” obscures their view of the consistency of the two quantified statements.

A different explaining-away of the appearance of falsity in (2c) is provided by Braun (1998). Braun notes that since “Superman can fly” and “Clark can fly” express the same Russellian proposition, (2b) and (2c) express the same Russellian proposition as well. But someone judging (2b) and (2c) may take their common content in one way when judging (2b) and in another when judging (2c), which makes it at least intelligible that they resist the substitution inference.

So, there are things the Russellian can say about conversations among the screenwriters for Superman II, when they agree that at the start of the movie Lois should be shown beginning to suspect that Clark is Superman, and should then confirm that he is, by tricking him when he is personified as Clark into giving himself away. That the screenplay will thereby have Lois beginning to suspect that Clark is Clark, and then tricking him into revealing it, is overlooked by the writers: it never occurs to them (as a Russellian would say) that these are the same identity-proposition, taken in different ways.

Russellian propositions are “coarse-grained” compared to Fregean ones, for the latter are individuated in such a way that the propositions that Clark is Clark and that Clark is Superman are two. But once one accepts the distinction between proposition and way of taking the same, it is not clear what limits there are on the coarseness of grain that may be tolerated. There seems to be no obstacle to an unstructured conception of propositions as classes of possible worlds (Lewis 1979; Stalnaker 1984, 1987), and conceivably, it is defensible that true and false are the only propositions. (The same question about how much coarseness of grain is tolerable arises for hidden-indexical theorists who postulate indexically specified ways of thinking of Russellian propositions.)

b. Commonsense Psychology

Another question for Russellianism stems from the main purpose we have in ascribing attitudes: to arrive by abduction at explanations of behavior based on psychological generalizations (“those who believe Superman is present feel safer,” Rupert 2008:83). Someone who (i) feels safer if he believes that Superman is present, and (ii) sees that Clark is present, may still behave nervously or flee, which on the face of it is hard to understand if seeing that Clark is present is the same thing as seeing that Superman is present. Similarly, there are general normative principles of rationality such as

(20)
Anyone who believes a conditional proposition and its antecedent ought to infer its consequent.

This is not to say that such a person ought to believe its consequent: once the consequent is inferred, the thinker has various options, such as rejecting the conditional, or its antecedent, as alternatives to accepting its consequent. But a person who, at a minimum, does not make the inference, betrays a failure of rationality. However, Lex may believe the proposition that if Superman is nearby, then he, Lex, should hide. Lex may then notice and so come to believe that Clark is nearby, but take no steps to conceal himself. Yet if believing that Clark is nearby is the same thing as believing that Superman is nearby (bel-ing a certain proposition via some way of taking it), it seems that we should convict Lex of a failure of rationality, in that he remains unmoved by his two beliefs and so has apparently failed to use modus ponens. (The literature on logic, rationality, and closure under consequence is relevant here; see, for instance, Jago 2009, MacFarlane 2018, Staffel 2018.)

In response to this, Braun (2000) argues that psychological explanation employs ceteris paribus (other-things-equal) principles. For example, even in a case where it is clear to Lex that Superman is nearby, his making no attempt to hide does not mean, say, that he no longer believes he should hide if Superman is nearby, or no longer trusts modus ponens. He will only hide, or try to hide, other things equal. And if he already knows that he is in a location where there are no hiding places, his motivation to seek one is thereby overridden.

So far, this is just commonsense psychology. But according to Braun, there is a special way in which things might not be equal: although a conditional and its antecedent are believed, the antecedent as it occurs as minor premise of the modus ponens and the antecedent as it occurs as a constituent of the major premise may not be grasped in matching ways (2000:209). And if they are not, grounds for anticipating the expected behavior are removed. This means the principle stated in (20) is incorrect as it stands: the correct version would require a “matching ways” restriction. So there is no lapse of rationality on Lex’s part when he fails to use modus ponens in the case where he notices Clark is nearby, and so believes that Superman is nearby, and also believes he should hide if Superman is nearby. For the constituent corresponding to “Superman is nearby” in the way he takes the conditional is different from the way he takes the proposition that Superman is nearby when he comes to believe it once he has noticed that Clark is nearby. Braun admits (2000:234) that he cannot see any other way in which (20) is in need of qualification, so there is a whiff of the ad hoc about his response; but it does allow for a version of (20) acceptable to Russellians.
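
The effect of a “matching ways” restriction on (20) can be sketched in code. The Python below is only an illustration under simplifying assumptions: a belief is a pair of a Russellian content and a label for the way it is taken, and a conditional carries a single label recording how its antecedent constituent is taken. The function `modus_ponens_required` and the labels `"Superman-way"` and `"Clark-way"` are hypothetical names introduced here, not Braun’s terminology.

```python
# Toy model of a "matching ways" restriction on principle (20).
# A belief is (content, way). A conditional's content is
# ("if", antecedent, consequent); its way records how the antecedent
# constituent of the conditional is taken.

SUPERMAN_NEARBY = ("kal-el", "is nearby")  # one Russellian proposition
LEX_HIDES = ("lex", "should hide")

lex_beliefs = {
    # The conditional, with its antecedent taken in the "Superman" way.
    (("if", SUPERMAN_NEARBY, LEX_HIDES), "Superman-way"),
    # The antecedent is believed, but taken in the "Clark" way.
    (SUPERMAN_NEARBY, "Clark-way"),
}


def modus_ponens_required(beliefs, matching_ways):
    """Return the consequents the thinker is rationally required to infer."""
    required = set()
    for content, way in beliefs:
        if content[0] != "if":
            continue  # not a conditional
        _, antecedent, consequent = content
        for other, other_way in beliefs:
            same_content = other == antecedent
            ways_ok = (not matching_ways) or other_way == way
            if same_content and ways_ok:
                required.add(consequent)
    return required
```

With `matching_ways=False` the function returns a set containing `LEX_HIDES`, so the unrestricted principle convicts Lex of irrationality. With `matching_ways=True` it returns the empty set, since the antecedent is not taken in the way it figures in the conditional.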

c. Saul on Simple Sentences

Another prominent defense of Russellianism, due to Saul (1997a, 1997b, 1999, 2007), focuses on “simple sentences,” sentences where we have a strong intuition of substitution-resistance, but there is no sense-invoking expression in the sentence whose semantics might underwrite the intuition. We have already noted one example, (21a) below. The other examples in (21) also manifest the phenomenon:

(21)
a. Clark is never around when Superman is.
b. Clark went into the phone booth and Superman came out.
c. Superman is more successful with women than Clark is.

There is a clear challenge to the Fregean in these examples. The inference in (2) fails, according to the Fregean, because of the semantics of “believes,” which requires its complement content-sentence to behave in a special way: to switch its reference, to make a double contribution to the truth-condition of the whole ascription, or to do whatever else one’s favored account of hyperintensionality proposes. But in the examples in (21), there is no expression which might force analogous behavior on the part of the names. Yet substitution of one name for the other in (21a) and (21c) produces something impossible, so, despite their apparent truth, (21a) and (21c) must be false. And substitution in (21b) seems to alter the meaning enough that the inference fails to be truth-preserving: (21b) appears to require a change of clothing or role, but a single substitution produces something which does not. These examples show that intuitions of substitution-failure do not depend on the presence of psychological vocabulary. And in the absence of anything else to explain them, they show that such intuitions must be mistaken.

Why, then, put any store in corresponding intuitions about (2)? However, hidden-indexical theorists can justify substitution-failure for the examples in (21) if they are willing to extend the scope of hidden-indexical introduction beyond attitude verbs. For instance, perhaps what we mean by (21b) is something along the lines of “Clark, so-attired, went into the phone booth, and Superman, so-attired, came out.” The “so” here accounts for substitution-failure as usual, since the names are associated with distinct ways of dressing: the “Superman” way (dressing as Superman) and the “Clark” way. For other examples, something more general than ways of dressing is needed, and this affords us an opportunity to make a partial unification of the cases of hyperintensional and simple sentences. A more general concept is that of personification, and using it, for (21a) we would have

(22)
Clark, so-personified, is never around when Superman, so-personified, is.

We have the same element of personification in the explanation of why fear of Superman is not the same thing as fear of Clark: to fear Superman, so-personified, is a very different thing from fearing Clark, so-personified (Forbes 2006:166–74).

A possible Fregean view, then, is that (22) is the literal meaning of (21a). According to Braun and Saul (2002) however, the intuition that (21a) can be true rests on some kind of confusion between it and the likes of (22); the latter certainly resists substitution, but differs in meaning from the former precisely because of that. Why would we suffer from such a confusion? Here Braun and Saul make use of the mental files metaphor, but they do not regard it as part of an account of difference in semantic content (see also Rupert 2008). We put information we would naturally express with one name in the file labeled with that name, and information we would naturally express with the other name goes into the file that other name labels. Then in assessing (21c), say, we compare the romantic history recounted in the entries in one file with that recounted in the other, and this task diverts our attention from the fact that the files concern the same individual. The attention-diverting element then explains why we judge (21c) to be true rather than impossible. Braun and Saul draw a parallel with the “Moses illusion” (2002:15–16), in which a large majority of subjects, when asked “How many animals of each kind did Moses take into the Ark?”, respond “Two,” partly because the “how many?” question diverts their attention from their knowledge that in the Bible it was Noah who took animals into his Ark (perhaps this happened to the reader just now).

But such an account cannot apply to speakers and writers who knowingly produce sentences like those in (21). For example, in a review of books about Shostakovich, the historian Orlando Figes wrote, “Shostakovich always signalled his connections to the classical traditions of St. Petersburg, even if he was forced to live in Leningrad” (The New York Review of Books, June 10, 2004, p.14). Far from having his attention somehow diverted from the fact that St. Petersburg is Leningrad, Figes is consciously writing for an audience aware of the identity, since only they will appreciate the rhetorical punch of his remark. And he will certainly resist an editor who proposes to replace “Leningrad” with a second “St. Petersburg,” even though there is nothing hyperintensional about being forced to live somewhere.

Another example comes from an article on the transformation of Eric Blair into George Orwell (Lingua Franca vol.9 #9). The writer of the article is hardly diverted from the fact that Blair is Orwell, since his topic is exactly how one personification came to be abandoned for another in the same individual:

Diffident in private, Blair so feared failure in the literary marketplace that he invented a pseudonym for the book he wrote based on his diaries, Down and Out in Paris and London. Criticism would be directed at George Orwell, not Eric Blair. But since the book, when published in 1933, was a literary success, Eric Blair became George Orwell.

Perhaps “criticism would be directed at George Orwell, not Eric Blair” is hyperintensional, but “Eric Blair became George Orwell” is not. Yet the latter clearly resists substitution of “George Orwell” for “Eric Blair,” and it would be absurd to say that the writer only makes the claim because he has allowed himself to lose sight of the fact that Blair and Orwell are the same person.

A third example: a New Yorker cartoon in which Superman, so-personified, is talking to his therapist, and reports, “I’m doing super, but Clark can’t find a paper that’s hiring.” It is unclear who the cartoonist thought would find this funny, but knowing that it is the same person is required to get the joke.

These examples and others (including my favorite, in The New York Times’s “The Philosopher Stripper” article—see Forbes 2006:167–8) show that cases like (21)’s occur outside fiction, and that those who create them do so in full awareness of the relevant identity. That (21a) means what (22) means is certainly the most straightforward explanation of why (21a) is perfectly natural. So substitution-resistance in some simple sentences does not provide as great a threat to the claim of substitution-resistance in (2) as might at first seem, since the mechanisms producing the substitution-resistance may be seen as fundamentally the same in the two cases.

d. Richard’s Phone Booth

The final argument for Russellianism to be considered here is the well-known phone booth case in Richard (1983); I have updated it to cell phones. This example exploits the context-dependence of indexical expressions such as “I,” “here,” and “now.” The phenomenon of indexicality was one on which Frege had pronounced views: he wrote about “I” that (Frege 1967:25–6)

…everyone is presented to himself in a particular and primitive way, in which he is presented to no-one else. So when Dr. Lauben thinks he has been wounded, he will probably take as a basis this primitive way in which he is presented to himself. And only Dr. Lauben can grasp thoughts determined in this way. But now Lauben may want to communicate with others. He cannot communicate a thought which he alone can grasp. Therefore, if he now says “I have been wounded,” he must use “I” in a sense which can be grasped by others, perhaps in the sense of “he who is speaking to you at this moment”….

Whatever one thinks of the last remark, the idea that for each thinker x, “I” can be used by x to express a private first-person way of thinking of x, is one which has persisted since Frege proposed it, and is of course implicitly present in much of the history of philosophy, for example, in Descartes’ cogito. (For further discussion of first-person and more generally indexical and demonstrative thought, see Anscombe 1974, Castaneda 1968, Evans 1981, Lewis 1979, Magidor 2015, Peacocke 1983, 2008 Ch. 3, and Perry 1977, 1979.)

An example in Perry (1979) provides a dramatic illustration. Perry is pushing a grocery cart around the aisles in a store when he comes across a trail of sugar on the floor. He thinks “that person is making a mess” and sets off in pursuit to let them know that a bag of sugar in their cart has burst (“that person” is an example of “deferred ostension,” referring via the sugar trail to the person whose cart the sugar bag is in; see further Borg 2002). His pursuit brings him back to the same point in the store, and he realizes, “I am the one who is making a mess.” This appears to be a new thought, and a Fregean would say it differs from “that person is making a mess” in view of the difference between Perry’s demonstrative way of thinking expressed by “that person” and his first-person way of thinking, “I.”

Fregean first-person ways of thinking are private in the sense that if x and y are distinct thinkers, y cannot employ x’s “I”-way of thinking in y’s thoughts, certainly not as a way of thinking of y. However, this does not stop y from ascribing attitudes to x that require x to be employing x’s own first-person way of thinking (see Peacocke 1981, Percus and Sauerland 2003). y might say that Perry has just realized he himself is the one making a mess, which is to make the ascription “Perry has just so-realized that he himself is the one making a mess.” The ability to describe a Fregean proposition as one that is a special way of thinking of the state of affairs that Perry is making a mess does not imply that the constituents of that proposition are available to the ascriber to use in his or her own thoughts.

But de dicto ascriptions may not always be possible. If Perry says of some store employee, “she knows that I made the mess,” he is not ascribing knowledge to her of the proposition that is his “I made the mess”-labeled way of thinking of the state of affairs that Perry made the mess. From a Fregean point of view, the most Perry can mean is the de re “I am known by her to have made the mess,” since the store employee will probably have identified the culprit demonstratively, “that guy is making the mess,” after following the sugar trail. Perry cannot even ascribe a de dicto demonstrative belief to the employee using “she believes that guy is making a mess” pointing at his own reflection in a mirror. Ascribers using a demonstrative in the content-sentences of their ascriptions are expressing their own demonstrative ways of thinking of the relevant object, not characterizing the subject’s thought, which means that the ascriptions are de re (Forbes 1987:13–15).

Let us now return to Richard’s example. It involves switching contexts (“context-hopping”) and uses Kaplan’s (1989) apparatus to manage context-dependence. In Kaplan’s semantics for context-dependent expressions, sentences are evaluated taken in a context and with respect to a possible world, the circumstances of evaluation (1989:544). A context is a sequence of entities which provides referents for the indexicals and demonstratives in a sentence S and so determines the Russellian proposition S expresses. At a minimum, we would have an agent, a time, a place, and an addressee, to be the referents of “I,” “now,” “here,” and “you,” and an object x to be the referent of a demonstrative or demonstrative pronoun (Kaplan uses “agent” rather than “speaker” to allow for a sentence such as “I am not speaking right now” to be true with respect to silent circumstances). When contexts are systematically related, the truth-values of sentences given fixed circumstances are systematically related. For example, suppose that in circumstances w, X is listening to Y at noon Mountain Time (MT), 11/16/17, and let c be a context with X as its agent, noon 11/16/17 MT as its time, and Y as its addressee.

Then the sentence “I am now listening to you” is true taken in c with respect to w. But if we obtain a new context c* from c by switching agent and addressee, then “I am now listening to you” is false taken in c* with respect to w, since Y is speaking, not listening, to X at noon MT 11/16/17, in w. However, “you are now listening to me” is true taken in c* with respect to w, since “I am now listening to you” taken in c identifies the same state of affairs as “you are now listening to me” taken in c*, the state of affairs that X is listening to Y at noon MT, 11/16/17.
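
The way truth-values shift across systematically related contexts can be made concrete with a short sketch. The Python below is a minimal illustration, assuming that a context is just an agent, a time, and an addressee, and that the circumstances w are a set of listening facts. The class `Context` and the functions `content` and `true_in` are hypothetical names, and only the two sentences discussed above are covered.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Context:
    agent: str      # referent of "I" / "me"
    time: str       # referent of "now"
    addressee: str  # referent of "you"


NOON = "noon MT 11/16/17"

# Circumstances w: listening facts of the form (listener, speaker, time).
w = {("X", "Y", NOON)}


def content(sentence, c):
    """State of affairs identified by the sentence taken in context c."""
    if sentence == "I am now listening to you":
        return (c.agent, c.addressee, c.time)
    if sentence == "you are now listening to me":
        return (c.addressee, c.agent, c.time)
    raise ValueError("sentence not covered by this sketch")


def true_in(sentence, c, world):
    """Truth of the sentence taken in c with respect to circumstances world."""
    return content(sentence, c) in world


c = Context(agent="X", time=NOON, addressee="Y")
# c* is obtained from c by switching agent and addressee.
c_star = replace(c, agent=c.addressee, addressee=c.agent)
```

Here `content("I am now listening to you", c)` and `content("you are now listening to me", c_star)` are the same triple, which is why the two sentences agree in truth-value with respect to w even though “I am now listening to you” changes truth-value between c and c*.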

In the circumstances w of Richard’s example, a man a is in his apartment, talking to a woman o on his cell phone. a is also looking out the window onto the street below, where he sees a woman talking on her cell phone. It does not occur to a that the woman he is talking to on his phone might be the woman he is watching through his window; but in fact both are o. Then a notices a man in the street acting suspiciously, apparently trying to sneak up on o from behind. In this situation, a could use “she is in danger” to make a sincere assertion to o on his phone about what he sees. But a would not use “you are in danger” to make a sincere assertion to o speaking into his phone (a might instead open the window and shout down to the street). So in the context c with a as agent, o as phone addressee, and o as the referent of “she,” and taking at face value the facts about what a would and would not say with which referential intention as indicative of what a does and does not believe, the following appear to be true:

(23)
a. I believe she is in danger.
b. I do not believe you are in danger.

But Richard argues (1990:117–8) that (23b) is in fact false; in other words, that a does have a belief he could express by asserting into his phone “you are in danger” with the intention to address the person he is talking to. For if we now consider a context c* in which the woman o is agent (and, if we like, a is addressee), the truth of (23a) in c guarantees the truth of

(24)
The person watching me believes I am in danger

in c*. Consequently, if we switch back to the context c,

(25)
The person watching you believes you are in danger

is true.

But there is a true identity in c which, together with (25), entails the falsity of (23b), namely,

(26)
I am the person watching you.

By =E, we have the anti-Fregean conclusion

(27)
I believe you are in danger

now seen to be true in c after all.
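
The context-hopping steps from (23a) to (27) can be traced in a short sketch of how they look on a purely Russellian semantics. The Python below makes several simplifying assumptions that should be read as such: belief is modeled as a two-place relation between a believer and a Russellian proposition, the description “the person watching …” is treated as simply contributing whoever satisfies it, and the de re/de dicto distinction is not represented at all. The names `Context`, `watcher_of`, `content`, and `true_in` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    agent: str         # referent of "I" / "me"
    addressee: str     # referent of "you"
    demonstratum: str  # referent of "she"


# Circumstances w of the example.
WATCHING = {("a", "o")}                # a is watching o
BELIEFS = {("a", ("o", "in danger"))}  # believer paired with a Russellian proposition


def watcher_of(x):
    """The unique person watching x in the circumstances."""
    (watcher,) = [p for (p, q) in WATCHING if q == x]
    return watcher


def content(label, c):
    """Russellian content of each numbered ascription taken in context c."""
    if label == "23a":  # I believe she is in danger
        return (c.agent, (c.demonstratum, "in danger"))
    if label == "24":   # The person watching me believes I am in danger
        return (watcher_of(c.agent), (c.agent, "in danger"))
    if label == "25":   # The person watching you believes you are in danger
        return (watcher_of(c.addressee), (c.addressee, "in danger"))
    if label == "27":   # I believe you are in danger
        return (c.agent, (c.addressee, "in danger"))
    raise ValueError(label)


def true_in(label, c):
    return content(label, c) in BELIEFS


c = Context(agent="a", addressee="o", demonstratum="o")
c_star = Context(agent="o", addressee="a", demonstratum="o")
```

On these assumptions (23a) and (27) taken in c, (24) taken in c*, and (25) taken in c all receive the same content, so the truth of any one of them carries over to the others. The identity (26) corresponds to the fact that `c.agent` equals `watcher_of(c.addressee)`. The sketch shows only why the reasoning goes through for the Russellian; it is silent on whether the ascriptions have further readings.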

By Russellian lights, the reasoning is impeccable. But should it move the Fregean? For the Fregean, attitude ascriptions can be ambiguous between de re and de dicto construals, and this applies to (27) in particular. Does the derivability of (27) really show that in c the protagonist a can express a belief of his by asserting “you are in danger” into his phone, using “you” with the intention to refer to the woman he is talking to? Perhaps all that the derivation establishes is the truth of the de re reading of (27), “you are someone I believe to be in danger.” Note that to say that (27)’s de re reading is true in c is not to say that the agent of c believes that it is true, so it still does not give a grounds to say “you are in danger” into his phone.

(23a) can be understood de re as “she is someone I believe to be in danger,” and if the argument is construed de re throughout, the reasoning is correct. But of course the de re conclusion is not a problem for the Fregean. A de dicto conclusion might well be problematic, but to get one we must at least start with the reading of the premise (23a) on which it is a true de dicto self-ascription. Then, if the de re but not the de dicto reading of (27) is true, there must be some step in which there is a de dicto to de re switch. The switch appears to occur in moving from (23a) to (24).

(24) is relevantly similar to an ascription of Perry’s, “the store employee knows that I made the mess.” Here Perry is not ascribing knowledge of the proposition that is his “I made the mess”-labeled way of thinking of the state of affairs that Perry made the mess. By the same token, we should not construe (24) as o’s making an ascription to a of belief in the proposition that o expresses by “I am in danger.” For that way of thinking of the state of affairs that o is in danger is simply unavailable to a, since it involves o’s first-person way of thinking of herself. The truth of (24), then, is no more than the truth of “I am someone who the man watching me believes is in danger,” whose truth in c* is a consequence of (23a)’s truth in c. Thus, the de re conclusion follows from the de dicto starting point, but, to repeat, the de re conclusion is acceptable to the Fregean, since it is silent on what way of thinking the man watching o employs in his “she is in danger” thought.

Richard considers this kind of response (1990:128–32; see also 190–6 for his own critique of his earlier argument) and rejects it. This is partly because he thinks the response imputes opacity to subject-position in ascriptions, and partly because he is generally skeptical about the de re/de dicto distinction. But the above criticism does not seem to involve any opacity in subject-position, that is, a failure of =E when applied to ascriber, for the use of (26) is legitimate, there is no single context in which (23a)’s “I” and (24)’s “the man watching me” are coreferential, and the content-sentence is different in (23a) and (24). Certainly, the reference of “I” in c is the same as the reference of “the man watching me” in c*, but this does not threaten the use of =E if the content-sentence is fixed and interpreted uniformly, in Fine’s sense: “the man who is agent of c believes she is in danger” and “the man who is watching the agent of c* believes she is in danger” have the same truth-value if “she” is unequivocal, and in the second ascription, “she” is not anaphoric upon the embedded “the agent of c*.”

As for general skepticism about de re/de dicto, the reader may refer to the discussion in section 2. Relevant examples arise in extensions of Richard’s case, where the apparent truth of certain statements is easily explained using the distinction, but not without. Suppose that the suspiciously behaving man turns out to be a harmless drunk who staggers on by. The phone conversation then continues in such a way that a soon realizes that the woman he is talking to is the woman he was watching. a may then say such things to o over the phone as “so it was you I thought was in danger” or “I thought you were in danger but didn’t say anything because I didn’t realize it was you I was watching.” These are perfectly natural remarks and seem to be true along with (23b). Employment of the de re/de dicto distinction provides a straightforward explanation of how they can all be true together. So there is no need to take on the obligation burdening the Russellian, of always having to explain away the appearance of truth.

6. References and Further Reading

  • Almog, Joseph, John Perry, and Howard Wettstein (eds.) 1989. Themes from Kaplan. Oxford University Press.
  • Almog, Joseph and Paolo Leonardi (eds.) 2009. The Philosophy of David Kaplan. Oxford University Press.
  • Anderson, C. Anthony. 1980. Some New Axioms for the Logic of Sense and Denotation: Alternative (0). Noûs 14:217–234.
  • Anscombe, Elizabeth. 1974. The First Person. In Mind and Language: The Wolfson Lectures, edited by Samuel Guttenplan, 45–65. Oxford University Press.
  • Bach, Kent. 1997. Do Belief Reports Report Beliefs? Pacific Philosophical Quarterly 78:215–241.
  • Bealer, George. 1993. A Solution to Frege’s Puzzle. In Philosophical Perspectives 7: Language and Logic, edited by James Tomberlin, 17–60. Ridgeview.
  • Berto, Francesco. 2013. Impossible Worlds. In Stanford Encyclopedia of Philosophy, edited by Edward Zalta. https://plato.stanford.edu/
  • Bjerring, Jens Christian, and Mattias Skipper Rasmussen. 2018. Hyperintensional Semantics: A Fregean Approach. Forthcoming in Synthese.
  • Borg, Emma. 2002. Pointing at Jack, Talking about Jill: Understanding Deferred Uses of Demonstratives and Pronouns. Mind & Language 17:489–512.
  • Braun, David. 1998. Understanding Belief Reports. The Philosophical Review 107:555–595.
  • Braun, David. 2000. Russellianism and Psychological Generalizations. Noûs 34:203–236.
  • Braun, David, and Jennifer Saul. 2002. Simple Sentences, Substitution, and Mistaken Evaluations. Philosophical Studies 111:1–41.
  • Brogaard, Berit. 2008. Attitude Ascriptions: Do You Mind the Gap? Philosophy Compass, Epistemology 3:93–118.
  • Burge, Tyler. 1978. Belief and Synonymy. The Journal of Philosophy 75:119–138.
  • Burge, Tyler. 1979. Sinning against Frege. The Philosophical Review 88:398–432.
  • Castaneda, Hector-Neri. 1968. On the Logic of Attributions of Self-Knowledge to Others. The Journal of Philosophy 65:439–456.
  • Chalmers, David. 2002. On Sense and Intension. In Sense and Direct Reference, edited by Matthew Davidson, 605–651. McGraw Hill.
  • Chalmers, David. 2011. Propositions and Attitude Ascriptions: A Fregean Account. Noûs 45:595–639.
  • Church, Alonzo. 1950. On Carnap’s Analysis of Statements of Assertion and Belief. Analysis 10:97–99. Also in Linsky (ed.) 1971, 168–170.
  • Church, Alonzo. 1951. A Formulation of the Logic of Sense and Denotation. In Structure, Method and Meaning: Essays in Honor of Henry M. Sheffer, edited by P. Henle, H. M. Kallen and S. K. Langer, 3–24. Liberal Arts Press.
  • Corazza, Eros. 2010. From “Giorgione” Sentences to Simple Sentences. Journal of Pragmatics 42:544–556.
  • Cresswell, Max. 1985. Structured Meanings. The MIT Press.
  • Crimmins, Mark. 1992. Talk About Belief. The MIT Press.
  • Crimmins, Mark, and John Perry. 1989. The Prince and the Phone Booth: Reporting Puzzling Beliefs. The Journal of Philosophy 86:685–711.
  • Davidson, Donald. 1969. On Saying That. In Davidson and Harman (eds.), 73–91.
  • Davidson, Donald, and Gilbert Harman (eds.) 1969. Words and Objections: Essays on the Work of W. V. Quine. Reidel.
  • Davies, Martin. 1981. Meaning, Quantification and Necessity. Routledge.
  • Davies, Martin, and Lloyd Humberstone. 1980. Two Notions of Necessity. Philosophical Studies 38:1–30.
  • Dennett, Daniel. 1982. Beyond Belief. In Thought and Object, edited by Andrew Woodfield, 1–95. Oxford University Press.
  • Donnellan, Keith. 1966. Reference and Definite Descriptions. The Philosophical Review 75:281–304.
  • Donnellan, Keith. 1974. Speaking of Nothing. The Philosophical Review 83:3–30.
  • Dummett, Michael. 1973. Frege: Philosophy of Language. Duckworth.
  • Evans, Gareth. 1973. The Causal Theory of Names. Proceedings of the Aristotelian Society, Supplementary Volume 47:187–208. Also in Evans 1985, 1–24.
  • Evans, Gareth. 1981. Understanding Demonstratives. In Evans 1985, 291–321. Oxford University Press.
  • Evans, Gareth. 1985. Collected Papers, edited by Antonia Phillips. Oxford University Press.
  • Fara, Delia. 2001. Descriptions as Predicates. Philosophical Studies 102:1–42.
  • Field, Hartry. 1978. Mental Representation. Erkenntnis 13:9–53.
  • Fine, Kit. 1989. The Problem of De Re Modality. In Almog, Perry and Wettstein (eds.) 1989, 197–272.
  • Fine, Kit. 2007. Semantic Relationism. Blackwell.
  • Forbes, Graeme. 1987. Indexicals and Intensionality: A Fregean Perspective. The Philosophical Review 96:3–31.
  • Forbes, Graeme. 1990. The Indispensability of Sinn. The Philosophical Review 99:535–563.
  • Forbes, Graeme. 1996. Substitutivity and the Coherence of Quantifying In. The Philosophical Review 105:337–72.
  • Forbes, Graeme. 2006. Attitude Problems. Oxford University Press.
  • Forbes, Graeme. 2018. Content and Theme in Attitude Ascriptions. In Non-Propositional Intentionality, edited by Alex Grzankowski and Michelle Montague. Oxford University Press.
  • Fox, Chris, and Shalom Lappin. 2005. Foundations of Intensional Semantics. Basil Blackwell.
  • Frege, Gottlob. 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100:25–50. Translated as “On Sense and Reference,” in Translations from the Philosophical Writings of Gottlob Frege, edited by Peter Geach and Max Black, 1970. Basil Blackwell.
  • Frege, Gottlob, and Bertrand Russell, 1904. Selection from the Frege-Russell Correspondence. In Salmon, N. and Scott Soames (eds.), 56–57.
  • Frege, Gottlob. 1967. The Thought: A Logical Enquiry. In Philosophical Logic, edited by Peter Strawson, 17–38. Oxford University Press.
  • Geach, Peter. 1969. The Perils of Pauline. Review of Metaphysics 23:287–300.
  • Gluer, Kathrin, and Peter Pagin. 2006. Proper Names and Relational Modality. Linguistics and Philosophy 29:507–535.
  • Gluer, Kathrin, and Peter Pagin. 2012. General Terms and Relational Modality. Noûs 46:159–199.
  • Heim, Irene, and Angelika Kratzer. 1998. Semantics in Generative Grammar. Basil Blackwell.
  • Jago, Mark. 2009. Logical Information and Epistemic Space. Synthese 167:327–341.
  • Jespersen, Bjorn. 2003. Why the Tuple Theory of Structured Propositions Isn’t a Theory of Structured Propositions. Philosophia 31:171–183.
  • Kadmon, Nirit. 2001. Formal Pragmatics. Blackwell.
  • Kaplan, David. 1969. Quantifying In. In Davidson, D. and Harman G. (eds.), 206–242; also in Linsky, L. (ed.), 112–144. Page references here are to the Linsky reprint.
  • Kaplan, David. 1986. Opacity. In The Philosophy of W. V. Quine, edited by Lewis Edward Hahn and Paul Arthur Schilpp, 229–289. Open Court.
  • Kaplan, David. 1989. Demonstratives. In Joseph Almog, John Perry and Howard Wettstein (eds.) 1989, 481–563.
  • Kaplan, David. 1990. Words. Proceedings of the Aristotelian Society 64:93–117.
  • Kazmi, Ali Akhtar. 1987. Quantification and Opacity. Linguistics and Philosophy 10:77–100.
  • Kripke, Saul. 1972. Naming and Necessity. In Semantics of Natural Language, edited by Donald Davidson and Gilbert Harman, 252–355. Reidel. Republished with an introduction as Naming and Necessity by Saul Kripke, Harvard University Press 1980. Page references here are to the 1980 book.
  • Kripke, Saul. 1979. A Puzzle about Belief. In Meaning and Use, edited by Avishai Margalit, 239–283. Reidel. Also in Salmon and Soames 1988: 102–148, and in Kripke, S. (ed.) 2011, 125–161. Page references in this article are to the last of these.
  • Kripke, Saul. 2001. Frege’s Theory of Sense and Reference. In Kripke, S. (ed.) 2011, 254–291.
  • Kripke, Saul. 2008. Unrestricted Exportation and Some Morals for the Philosophy of Language. In Kripke, S. (ed.) 2011, 322–350.
  • Kripke, Saul. 2011. Philosophical Troubles. Oxford University Press.
  • Kvart, Igal. 1984. The Hesperus-Phosphorus Case. Theoria 50:1–35.
  • Lewis, David. 1979. Attitudes de dicto and de se. The Philosophical Review 88:513–543.
  • Linsky, Leonard. 1967. Referring. Humanities Press.
  • Linsky, Leonard, (ed.) 1971. Reference and Modality. Oxford University Press.
  • Larson, Richard K., and Peter Ludlow. 1993. Interpreted Logical Forms. Synthese 95:305–355.
  • Lawlor, Krista. 2005. Confused Thought and Modes of Presentation. The Philosophical Quarterly 55:21–36.
  • Loar, Brian. 1972. Reference and Propositional Attitudes. The Philosophical Review 81:43–62.
  • Ludlow, Peter. 1996. The Adicity of “Believes” and the Hidden Indexical Theory. Analysis 56:97–101.
  • MacFarlane, John. 2018. In What Sense (If Any) is Logic Normative for Thought? in translation, in Modern Logic: Its Subject Matter, Foundations and Prospects, edited by D. Zaitsev, 345–383. Forum.
  • Magidor, Ofra. 2015. The Myth of the De Se. In Philosophical Perspectives 29: Epistemology, edited by John Hawthorne and Jason Turner, 249–283. Wiley Blackwell.
  • Maier, Emar. 2015. Parasitic Attitudes. Linguistics and Philosophy 38:205–236.
  • Marcus, Ruth Barcan. 1961. Modalities and Intensional Languages. Synthese 13:303–322. Also in Marcus 1993, 3–35.
  • Marcus, Ruth Barcan. 1962. Interpreting Quantification. Inquiry 5:252–259.
  • Marcus, Ruth Barcan. 1975. Does the Principle of Substitutivity Rest on a Mistake? In The Logical Enterprise, edited by Alan Anderson, Ruth Barcan Marcus and Richard Martin, 31–38. Yale University Press. Also in Marcus 1993, 101–109.
  • Marcus, Ruth Barcan. 1993. Modalities. Oxford University Press.
  • Mates, Benson. 1952. Synonymity. In Semantics and the Philosophy of Language, edited by Leonard Linsky, 111–136. University of Illinois Press.
  • Millikan, Ruth. 2000. On Clear and Confused Ideas. Cambridge University Press.
  • Moltmann, Friederike. 2003. Propositional Attitudes without Propositions. Synthese 135:70–118.
  • Moltmann, Friederike. 2008. Intensional Verbs and Their Intentional Objects. Natural Language Semantics 16 (3):239–270.
  • Moltmann, Friederike. 2017. Cognitive Products and Semantics of Attitude Verbs and Deontic Modals. In Act-Based Conceptions of Propositional Content, edited by Friederike Moltmann and Mark Textor. Oxford University Press.
  • Moss, Sarah. 2018. Probabilistic Knowledge. Oxford University Press.
  • Muskens, Reinhardt. 2005. Sense and the Computation of Reference. Linguistics and Philosophy 28:473–504.
  • Pagin, Peter, and Dag Westerståhl. 2010. Pure Quotation and General Compositionality. Linguistics and Philosophy 33:381–415.
  • Parsons, Terence. 1981. Frege’s Hierarchies of Indirect Senses and the Paradox of Analysis. In Midwest Studies in Philosophy Vol. VI: The Foundations of Analytic Philosophy, edited by P. A. French, T. Uehling and H. Wettstein, 37–58. Minnesota University Press.
  • Parsons, Terence. 2009. Higher-Order Senses. In Almog, J., and Leonardi, P. (eds.), 45–59.
  • Partee, Barbara. 2003. Privative Adjectives: Subsective Plus Coercion. In Presuppositions and Discourse. Essays offered to Hans Kamp, edited by Rainer Bäuerle, Uwe Reyle and Thomas Ede Zimmerman. Elsevier.
  • Peacocke, Christopher. 1981. Demonstrative Thought and Psychological Explanation. Synthese 49:187–217.
  • Peacocke, Christopher. 1983. Sense and Content. Oxford University Press.
  • Peacocke, Christopher. 2008. Truly Understood. Oxford University Press.
  • Peacocke, Christopher. 2009. Frege’s Hierarchy: A Puzzle. In Almog, J., and Leonardi, P. (eds.), 159–186.
  • Percus, Orin, and Uli Sauerland. 2003. On the LFs of Attitude Reports. Proceedings of Sinn und Bedeutung 7:228–42.
  • Perry, John. 1977. Frege on Demonstratives. The Philosophical Review 86:474–497.
  • Perry, John. 1979. The Problem of the Essential Indexical. Noûs 13:3–31.
  • Pickel, Bryan, and Brian Rabern. 2017. Does Semantic Relationism Solve Frege’s Puzzle? The Journal of Philosophical Logic 46:97–118.
  • Predelli, Stefano. 2010. Substitutivity, Obstinacy, and the Case of Giorgione. The Journal of Philosophical Logic 39:5–21.
  • Quine, W. V. 1956. Quantifiers and Propositional Attitudes. The Journal of Philosophy 53:177–187. Also in The Ways of Paradox by W. V. Quine, 1966, 177–87. Harvard University Press.
  • Quine, W. V. 1961. Reference and Modality. In From a Logical Point of View by W.V. Quine, Harper and Row, 139–157. Also in Linsky, L. (ed.), 1971, 168–170. Page references here are to the Linsky reprint.
  • Recanati, François. 2000. Oratio Obliqua, Oratio Recta. The MIT Press.
  • Recanati, François. 2012. Mental Files. Oxford University Press.
  • Richard, Mark. 1983. Direct Reference and Ascriptions of Belief. The Journal of Philosophical Logic 12:425–452.
  • Richard, Mark. 1986. Quotation, Grammar and Opacity. Linguistics and Philosophy 9:383–403.
  • Richard, Mark. 1990. Propositional Attitudes. Cambridge University Press.
  • Rupert, Robert. 2008. Frege’s Puzzle and Frege Cases: Defending a Quasi-Syntactic Solution. Cognitive Systems Research 9:76–91.
  • Russell, Bertrand. 1905. On Denoting. Mind 14:479–493. Also in Logic and Knowledge by Bertrand Russell, edited by R. C. Marsh, 41–56. Allen & Unwin, 1956.
  • Saka, Paul. 2006. The Demonstrative and Identity Theories of Quotation. The Journal of Philosophy 103:452–471.
  • Saka, Paul. 2018. Superman Semantics. In Advances in Pragmatics and Philosophy II, edited by Alessandro Capone et al., 141–157. Springer.
  • Salmon, Nathan. 1981. Reference and Essence. Princeton University Press.
  • Salmon, Nathan. 1986. Frege’s Puzzle. The MIT Press.
  • Salmon, Nathan. 1990. A Millian Heir Rejects the Wages of Sinn. In Propositional Attitudes, edited by C. Anthony Anderson and Joseph Owens, 215–247. CSLI Publications.
  • Salmon, N. and Scott Soames (eds.) 1988. Propositions and Attitudes. Oxford University Press.
  • Saul, Jennifer. 1997a. Substitution and Simple Sentences. Analysis 57:102–108.
  • Saul, Jennifer. 1997b. Reply to Forbes. Analysis 57:114–118.
  • Saul, Jennifer. 1999. Substitution, Simple Sentences, and Sex-scandals. Analysis 59:106–112.
  • Saul, Jennifer. 2007. Simple Sentences, Substitution, and Intuitions. Oxford University Press.
  • Schiffer, Stephen. 1979. Naming and Knowing. In Contemporary Perspectives in the Philosophy of Language, edited by P. A. French, T. Uehling and H. Wettstein, 61–74. University of Minnesota Press.
  • Schiffer, Stephen. 1992. Belief Ascription. The Journal of Philosophy 89:499–521.
  • Schiffer, Stephen. 1996. The Hidden-Indexical Theory’s Logical-Form Problem: A Rejoinder. Analysis 56:92–97.
  • Schweizer, Paul. 1993. Quantified Quinean S5. The Journal of Philosophical Logic 22:589–605.
  • Segal, Gabriel. 1989. A Preference for Sense and Reference. The Journal of Philosophy 86:73–89.
  • Sleigh, R. C. 1968. On a Proposed System of Epistemic Logic. Noûs 2:391–398.
  • Smullyan, Arthur. 1948. Modality and Description. The Journal of Symbolic Logic 13:31–37. Also in Linsky 1971, 35–43.
  • Soames, Scott. 1987. Direct Reference, Propositional Attitudes, and Semantic Content. Philosophical Topics 15:47–87.
  • Soames, Scott. 2002. Beyond Rigidity. Oxford University Press.
  • Sosa, Ernest. 1970. Propositional Attitudes De Dicto and De Re. The Journal of Philosophy 67:883–896.
  • Staffel, Julia. 2018. Attitudes in Active Reasoning. In Reasoning. New Essays on Theoretical and Practical Thinking, edited by Magdalena Balcerak Jackson and Brendan Balcerak Jackson. Oxford University Press.
  • Stalnaker, Robert. 1984. Inquiry. The MIT Press.
  • Stalnaker, Robert. 1987. Semantics for Belief. Philosophical Topics. Also in Content and Context by Robert Stalnaker, 117–129. Oxford University Press, 1999.
  • Taschek, William. 1988. Would a Fregean Be Puzzled by Pierre? Mind 97:99–104.
  • Taylor, Kenneth. 2002. De Re and De Dicto: Against the Conventional Wisdom. Philosophical Perspectives 16:225–265.
  • Thomason, Richmond. 1980. A Model Theory for Propositional Attitudes. Linguistics and Philosophy 4:47–70.
  • Tichy, Pavel. 2004. The Myth of Non-Rigid Designators. In Pavel Tichy’s Collected Papers in Logic and Philosophy, edited by Vladimir Svoboda, Bjorn Jespersen and Colin Cheyne.
  • Washington, Corey. 1992. The Identity Theory of Quotation. The Journal of Philosophy 89:582–605.
  • Yalcin, Seth. 2015. Quantifying in from a Fregean Perspective. The Philosophical Review 124:207–253.
  • Zalta, Edward. 2001. Fregean Senses, Modes of Presentation, and Concepts. In Philosophical Perspectives 15: Metaphysics, edited by James Tomberlin, 335–359.

Author Information

Graeme Forbes
Email: graeme.forbes@colorado.edu
University of Colorado
U. S. A.

Sayyid Qutb (1906—1966)

Sayyid Qutb was one of the leading Islamist ideological thinkers of the twentieth century. Living and working in Egypt, he turned to Islamism in his early forties after about two decades as a secular educator and literary writer. As an Islamist, he held that all aspects of society should be conducted according to the Shari’a, that is, laws of God as derived from the Qur’an and the practice (sunna) of the Prophet Muhammad. Probably his best known and most distinctive doctrine is his interpretation of jahiliyya (pre-Islamic ignorance) as characterizing all of the societies of his time, including the Muslim ones. Another doctrine was his interpretation of faith in one God only (tawhid) as entailing the absolute sovereignty of God (hakimiyyat Allah) and the liberation of humans from service to other humans instead of God. He was executed by the Egyptian government for his Islamist activities and is thus considered a martyr, something that has added immeasurably to the impact of his ideas.

Although he did not consider himself a philosopher, he had opinions on a number of topics that interest philosophers, and he commented on the ideas of philosophers. He had a grand vision of the universe as a harmonious whole under God’s rule and of humans as called upon to be God’s deputies in managing the Earth. Humans, however, were given a measure of freedom that other beings do not have. Rightly used, this freedom would allow humans to fit in harmoniously with the rest of creation and have the highest status under God. Misused, it would introduce discord into the world and misery into human life. Jahiliyya equates to misuse of this freedom, and Qutb calls for jihad, conceived along the lines of revolution, as the response. In discussing these things, he touches on a range of topics, including the nature of God and the universe, human nature, knowledge and revelation, ethics, society, human history, death, and judgment. This article presents only the latest and most radical phase of his thought.

  1. Biography
  2. Basic Conception
  3. God
  4. Human Nature and Purpose, Other Spiritual Beings
  5. Free Will and Predetermination, The Problem of Evil
  6. Knowledge: Revelation, Worldly Knowledge
  7. Ethical Values, Shari’a
  8. The Ideal Society (Utopia), Economics, Gender Relations
  9. Jahiliyya (Dystopia) and Jihad (Revolution)
  10. Human History
  11. Death, Judgment, Martyrdom
  12. Qutb’s Legacy
  13. Final Remarks: Aesthetics, Harmony, and Essentialism
  14. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Sayyid Qutb (1906—1966) was and is one of the most important ideologues of the Islamist movement, which seeks to re-establish truly Islamic values and practices in Muslim societies that have become more or less Westernized. He was born and raised in an Egyptian village, attended the state primary school there, and in 1920 moved to Cairo to attend secondary school and then Dar al-‘Ulum, a teacher training institute that sought to balance traditional and modern ways. From 1933 to 1952 he worked in the Ministry of Education, first as a teacher and later as an inspector and administrator. He also became one of the secular literary elite prominent at the time, publishing more than 100 poems as well as articles and books on literary and social topics. In 1948, he rather abruptly began to publish Islamist articles and the next year published a major Islamist book, Social Justice in Islam, which was to go through a total of six editions. The reasons for this shift are not totally clear, but the chaos of Egyptian politics, the efforts of imperialist powers to reassert their position, and the establishment of the state of Israel presumably played a role. His Islamism was confirmed during a two-year (1948-1950) study tour of the United States, which he found to be technologically impressive but hopelessly corrupt morally.

After his return to Egypt he joined the Muslim Brothers, the leading Islamist organization, founded in 1928 by Hasan al-Banna, and soon became one of its leading spokespersons. The Brothers supported the Free Officers’ revolution in 1952 at first but soon withdrew support. After an attempt on the life of Abdel Nasser in 1954, the leading Brothers were imprisoned, Sayyid Qutb among them. In prison, they suffered very harsh treatment, though poor health spared Qutb the worst of it. This led to a radicalization of his ideas, including the claim that the whole world, including the “Muslim” world, is in a state of jahiliyya, that is, un-Islamic ignorance and barbarism. This radicalization was assisted by the ideas of the extremely influential Indo-Pakistani Islamist Abu’l ‘Ala’ Mawdudi (1903-1979), whose writings became known to Qutb and other Arab thinkers from about 1951. Mawdudi’s ideas about divine sovereignty, the Islamic state, jahiliyya, and other things spoke very much to Qutb’s condition and helped him to crystalize and articulate his views.

In 1964, Qutb was released from prison and published his best-known book, Milestones, effectively calling for an Islamic revolution. He also became mentor to a group of young Brothers and was soon arrested for conspiring to overthrow the government. In 1966, he was convicted of this charge and executed. He thus became a martyr to his cause, considerably multiplying his influence.

Qutb wrote a number of books during his Islamist period in addition to those mentioned, especially a multi-volume commentary on the Qur’an, In the Shadow of the Qur’an, which he began in 1952 and was still revising at the time of his death.

Qutb’s radical ideas divided the Muslim Brothers after his death. The mainline group rejected them and sought to work within the existing political system, briefly achieving the presidency in 2012-2013. Smaller groups, such as the so-called Takfir wa-Hijra group, Jama‘at al-Islamiyya (Islamic group), and Tanzim al-Jihad (Jihad organization), adopted and modified Qutb’s ideas and were responsible for considerable terrorism through the 1990s (see below). His influence spread far beyond Egypt, indeed throughout the whole of the Islamic world and its diaspora. This included extreme groups such as al-Qaeda, whose second leader, Ayman al-Zawahiri, was very much influenced by Qutb’s main ideas and his example as a martyr, and who first joined an Islamist group the year that Qutb was executed. In fact, Qutb has come to be seen by many as the spiritual “godfather” of such groups. On the other hand, it is possible to read him selectively, and so he has influenced many who do not fully accept his extreme views. There is a considerable literature on him in both Islamic and Western languages.

Qutb was not a philosopher by most definitions of the term, and he consciously rejected philosophy as he understood it, both Western philosophy and classical Islamic philosophy. He considered the discipline to be an effort to accomplish with human reason what can only be accomplished on the basis of divine revelation and also as a foreign intrusion on pure Islamic thought. Nevertheless, his thinking was quite systematic and did have a place for reason; moreover, he used rational arguments in criticizing philosophy and made reference to Western philosophers (mostly known to him through Arabic translations) in the process. He also deals with many topics that are of interest to philosophers. He is a good example of Weber’s Wertrationalität (rationality in accordance with moral demands).

The following article is based entirely on the last phase of his writing, from about 1958, during which he rejected many of his earlier ideas. This phase was the most radical, most systematic, and most influential.

2. Basic Conception

Qutb saw his ideas as a necessary interpretation and corollary of the basic Muslim creed: “There is no god but God; Muhammad is the Messenger of God.” His views fall within the wide spectrum of Sunni Islamic thinking but particularly within the forms of it commonly labelled “Islamist” (stressing the application of Islamic norms to society) and “Salafi” (broadly, those who emphasize the authority of the Qur’an, Sunna, and the earliest generations of successors, the salaf, over against later “innovations”). Like many popular writers on religious topics in modern times, he did not have the traditional education given to the ‘ulama’ (religious scholars) and was to some extent self-taught in this area.

The article focuses primarily on the more basic and theoretical aspects of Qutb’s writing (what we might call his philosophy or theology), which he calls the Islamic tasawwur, a word usually translated “concept” or “conception,” but which here could also be translated “worldview” or “vision.” Qutb, in the manner of fundamentalists and also scientists, does not consider this his conception but the true conception. He characterizes this conception as divinely sourced, and following from that: fixed in its basics, comprehensive, balanced, dynamically positive, realistic, and unified.

The tasawwur grows out of its divine source and does not need or accept significant influence from the outside. Therefore, Qutb criticizes not only contemporary modernists, who wish to “reform” Islam in terms of modern, that is, Western ideas and ideologies, but also the earlier Muslim philosophers and theologians, who made use of Greek philosophical ideas. We may note that Qutb is firmly of the view that ideas are prior to actions, which flow from them. The ideas are not ends in themselves, however, but are meant to undergird actions and activities. In fact, all of human life and activity flows from a creedal tasawwur of some kind. Qutb often describes Islam (and religion more generally) in terms of three stages: tasawwur, manhaj (method, program), nizam (social and political order). Each stage proceeds from the former one with almost logical necessity. All three are necessary for Islam to exist. Since Qutb believed that there was no Islamic nizam in his time, he often said that Islam has no “existence.” We may note that Qutb’s Islam is a highly reified concept, not just a label applied variously to diverse human ideas and practices.

3. God

The centrepiece of the tasawwur is God (Allah), that awesome being Whose essence and some of Whose attributes are beyond the reach of human understanding, though many attributes can be understood by the human mind. (Qutb does not discuss the relation between God’s essence and attributes, an important theme in traditional Muslim theology.) These attributes belong only to God and constitute His divinity; no other being shares in them. God is one and unique. This is the first and most basic constituent of the tasawwur, and recognition of it is called tawhid (the usual Arabic term for belief in one God). God is also eternal, without beginning or end.

This God is the creator and source of everything else in existence. These things are separate from God but totally dependent on Him and harmoniously obey regular laws, some of which can be and have been discovered by human science. These laws are not separate from God, however. God acts directly in all that happens, so that these “laws” are just His customary way of acting. Since His will is completely free, He can and sometimes does vary His action and produce what we call miracles. For example, fire usually burns things, but God might make it not do so on some occasion, as in the story of the prophet Ibrahim (Abraham) in the Qur’an. Such events do not disrupt the general order and harmony of the universe, however, since they are part of God’s larger plan. While most of creation obeys God necessarily, humans in their moral aspect may or may not obey. Instead, they are subject to a moral law established by God, the Shari’a, which will put them in harmony with creation if they obey it.

God is therefore the Lord and Sustainer of all creation, while all creation stands in a relation of servanthood to Him, necessarily in the case of most things, willingly or unwillingly in the case of humans (disobedient humans are still servants). It follows necessarily from all of these attributes that God is the only source of authority and the only sovereign in the universe, not only physically but also morally, legally, and politically. No human ruler or nation may claim sovereignty, a point of major importance for Qutb’s revolutionary doctrine. These central ideas reflect those of Mawdudi, though Qutb probably stresses them more. His term for the sovereignty of God, hakimiyyat Allah, comes from the Arabic translation of Mawdudi’s term for the same thing.

4. Human Nature and Purpose, Other Spiritual Beings

Humans hold a very special place in God’s creation, as already indicated. According to the Qur’an, God created the human body and breathed His spirit into it, and He gave humans a status above the angels, whom He commanded to prostrate themselves before the first man. Human nature as originally created, and in its proper state, is called fitra, and this fitra has a need for God and a predisposition to serve Him. The Islamic tasawwur is congruent with it. The fitra may be obscured by human whims, desires, and negligence, but it is not destroyed.

The basic purpose of humans is to serve God willingly in all aspects of life. They are to do so in the honorable role of God’s deputy, khalifa, over the earth. They are responsible for making it fruitful, developing it technologically, caring for it, and organizing a just society in accordance with God’s Shari’a. This idea is very important to Qutb.

The only significant distinctions among humans in God’s sight are based on their obedience or disobedience to His will. Otherwise all are of equal value regardless of race, ethnicity, nationality, class, or gender, although in the last case there are significant differences of function to be discussed below.

Angels are spiritual beings who serve God and are always obedient to Him. They carry God’s throne, deliver God’s messages to the prophets, watch over the gates of paradise and hell, record the actions of humans, support them in their struggle against evil, pray for them, and cause them to die when their time comes. Jinn (the “genies” of the Arabian Nights) are made of fire, can live on the face of the earth or inside it, can move very swiftly, and are invisible, though they may become visible to humans. They have the power of moral choice and are commanded to serve God just as humans are. Some are believers, and some are not. They will be resurrected on the last day and go to paradise or hell. The Devil is a jinn. Satans may be humans or jinn; they tempt human beings and are enemies to prophets. We know about all of these because the Qur’an tells us. Human science knows nothing of them, though it may discover something about them some day. Awareness of these creatures expands our world beyond the limited one of physical perception.

5. Free Will and Predetermination, The Problem of Evil

But are humans really free in their moral choices, given that God is directly involved in determining everything that happens? Like earlier Muslim theologians, Qutb seeks to affirm both (this is one of the ways the Islamic tasawwur is balanced). He states that the human will works within the bounds of divine determination and that this divine determination is realized through human will. The precise relationship between them is one of those things that are beyond the capacity of human reason to comprehend. Some degree of human freedom is necessary for moral responsibility and for the activist position that Qutb took, while certainty that God is in control is important for the small, struggling revolutionary movement of which he was a part.

But why does evil exist at all and why do good people suffer? From time to time Qutb suggests various partial answers to the latter question. People suffer because they violate the physical or moral laws, or God causes them to suffer to teach them or to provide challenges. This world is a place of trial and striving, and the suffering of a good person will be compensated in the future life, and possibly also in this life. As to why God did not create a world without suffering and evil, this question is not raised by sincere believers, who respect God too much and know that the issue is beyond the capacity of the human intellect to deal with, nor is it raised by serious atheists since they do not believe in God. It is raised by those who are argumentative or not serious.

6. Knowledge: Revelation, Worldly Knowledge

How do humans know of God and of the truths enshrined in the Islamic tasawwur? The human fitra can perceive something about God in the harmony of the universe that He has created and runs (that is, the Teleological Argument). Of primary importance, however, is God’s word as revealed to messengers, to whom He has given a special nature that allows them to receive His messages, and particularly the word given to the Prophet Muhammad in the Qur’an. The text of the Qur’an contains the verbatim words of God and provides information about God, the universe, aspects of humans, divine moral and legal commands, and the final judgment of humans by God. It calls on humans to reflect on the signs of God in the harmony of the universe. It is from the Qur’an that the Islamic tasawwur is directly and exclusively derived.

The Qur’an speaks to all aspects of the human fitra, not only to reason but also to the emotions and the aesthetic sense. According to Qutb and most Muslims, it has the power to influence people directly through these. Qutb gives examples of this, including one in which a woman was converted to Islam by hearing the recitation of the Qur’an. In the years before he embraced Islamism, Qutb wrote two books exploring the literary nature of the Qur’an (Artistic Depiction in the Qur’an and Scenes of the Resurrection in the Qur’an) and concluded that its power comes from producing extremely evocative word pictures for the reader. He appears to have continued to hold this theory in his Islamist period though not limiting the power of the Qur’an to it.

Qutb generally insists on interpreting the text in terms of its plain meaning, but in the case of realities that are beyond human comprehension he understands it to provide allusions that inspire the human soul. These realities include the divine essence, the connection between the will of the Creator and creation, and the nature of the spirit. For the rest, reason can receive the revelation and interpret it, along with other faculties. On the whole, Qutb avoids metaphorical or esoteric interpretations of the Qur’an.

One should seek and may derive direct inspiration from the Qur’an, especially if one has a close and ongoing relation to it. Qutb claims to have lived for years “in the shadow of the Qur’an” (this is also the title of his Qur’an commentary). Especially important is the intention to act on what one reads. One is not to read the Qur’an simply as a devotional exercise, or to get information, but to find out what God wants one to do at a particular time and to do it. Qutb is convinced that the Qur’an will guide such a person. (This is part of what is meant by saying that the tasawwur is practical.) One will not truly understand the Qur’an unless one is engaged in the struggle (jihad) for an Islamic society.

For most Muslims, the Sunna (words and deeds) of the Prophet Muhammad is authoritative along with the Qur’an; and also authoritative is the tradition of scholarship related to these. Qutb likewise relies on the Sunna and, somewhat selectively, on the later tradition. He emphasizes the Qur’an, however, more than most. He also emphasizes the generation of Muslims contemporary with Muhammad, the “Unique Qur’anic Generation” as he terms them. This generation was present at the time of revelation and drew their understanding of life and their duties exclusively from it; they received it with the intention to obey as a soldier would receive marching orders for the day; also, they broke completely with their former life. No later generation has equalled them, but they should be the model for Islamic activists today.

Still, there are many areas of life in which human reason is sufficient for understanding and making discoveries, and in so doing fulfilling part of the human role as God’s khalifa. These involve what Qutb calls the “pure” sciences, mainly the physical sciences insofar as they do not involve moral or metaphysical issues. Splitting the atom would be included but not its use in atomic bombs. Biology is included but not Darwinian evolution. The Islamic tasawwur encourages this kind of science. It does not have the certainty of revelation but, properly done, it will not conflict with revelation. Qutb speaks of the “open book of the universe” (possibly echoing the 19th century Indian modernist, Sayyid Ahmad Khan). In fact, Western science is historically rooted in the past scientific activities of Muslims. It has developed in an anti-religious direction, but Islam can purify this science and put it on the sound basis of the fitra.

7. Ethical Values, Shari’a

General ethical values are of course part of the Islamic tasawwur. They are fixed and do not “develop” over time, although their application may vary. They provide a “fixed axis” and “fixed framework” around and within which human activity takes place. These values are not scattered or ad hoc but are systematic, constituting a complete system for all of life. As they derive from the one God, they unify humans with the creation and its Creator, and integrate individual personalities. To be valid, ethical action must be accompanied by faith in this God. Because they come from God, they provide a greater sense of obligation than secular morality can. Qutb criticizes various forms of secular morality at length.

In principle, there is no grey area in Qutb’s ethics. The contrast is stark between guidance and error, faith and kufr (unbelief, wilful rejection of faith), tawhid (recognition of God’s unity) and shirk (ascribing divinity to other beings than God). Along with this, however, he recognized that although basic ethical values do not change, their application does change with changing times and situations, both of which are experienced very much by modern revolutionaries.

The specific ethical rules and values are enshrined in the Shari’a, to which Qutb makes very frequent reference. This is commonly called the law of God but is more accurately described as a moral classification by God of all human actions into five categories: obligatory, approved, neutral, reprehensible or forbidden. The human understanding of the Shari’a is called fiqh (“understanding”) and is based on the Qur’an and the Sunna of the Prophet, along with the effort (ijtihad) of later scholars to interpret and apply these. Among Sunnis, the consensus of these scholars on any ruling has been considered to guarantee its validity, with the result that the scope for ijtihad has diminished over time. One of the major issues of modern times has been the degree of freedom contemporary interpreters should have to reverse past rulings in the light of current needs. Modernists seek a high degree of freedom in order to bring fiqh in line with prevailing values derived from the West. Qutb opposes ijtihad for this purpose, which he considers defeatism in the face of the West, and insists that there should be no ijtihad where there is a clear and authoritative text. He favors it, however, where, in his view, it represents an authentic Islamic response to current conditions. He calls this fiqh haraki (that is, a fiqh that reflects changing human activities or needs of the current Islamic movement). He also indicates approval of the unfettered use of the principle of public interest (maslaha), a principle recognized in traditional fiqh but usually with restrictions. At the same time, he regularly canvasses the views of earlier scholars on specific matters and sometimes accepts them. All of this accords with his claim that the Islamic tasawwur is realistic and practical. The term Shari’a is to some extent interchangeable or correlated with the term manhaj, and he seems to see the Shari’a as part of the Islamic manhaj. 
Qutb also claims that the Shari’a is perfectly harmonious with the general laws of the universe, including the physical laws of human biology, and is the only means by which the voluntary life of humans can be integrated with them, as briefly mentioned above.

8. The Ideal Society (Utopia), Economics, Gender Relations

The ideal society is one that recognizes the sovereignty of God alone, not the people, the nation, or the human ruler, and is governed by the Islamic Shari’a. Since the Shari’a is part of God’s overall law for the universe, a society truly governed by it will be in accord with the whole of the universe and with the human nature and needs of its members. It will be just, progressive, and tolerant. Class, racial, and ethnic distinctions will not influence people’s status, but rather piety, virtue, and competence. It will be a society in which people generally know who the virtuous and competent are and can choose them for leadership. He backs this up with descriptions of the society governed by the prophet Muhammad and his earliest successors, especially in Social Justice in Islam. Though the historical critic would probably claim that he is selective in his examples, Qutb’s view is that the history of Islam is not identical to the whole history of those societies called Muslim, but to the history of those societies insofar as they were truly following the Shari’a and implementing Islam.

While class, racial, and ethnic differences will not matter, religious differences will matter since the society is based on a religious creed. Qutb sometimes states that people have absolute freedom of conscience in matters of belief and that the freedom of any individual to hold and propagate his religious belief, free of compulsion, is a fundamental human right. It is not clear just how far this goes, however. No one should be forcibly converted to Islam. Jews and Christians (and possibly others) will have a place in society as granted by the Qur’an and Sunna. They may follow their own creeds and rites of worship but are limited in some areas, as specified in the traditional idea of dhimma (protected status), which Qutb generally accepts and defends. For example, they will pay a special tax called jizya, for which Qutb gives three reasons: it is a symbol of their acceptance of Islamic rule, it is in return for their protection by the Islamic government, and it contributes to the social expenses of the state. While dhimmis would be granted freedom of belief and worship, and Qutb speaks of freedom to propagate religious belief, it seems unlikely that a state run on Qutb’s interpretation would allow non-Islamic religious views to be propagated freely among Muslims, or anti-religious views to be propagated at all. This is especially the case given Qutb’s view that Islam alone is the true religion and his statement in at least one place that abandoning the truth is corruption. Such a state would hardly accept the kind of religious pluralism, the legal equality in principle of all religions, assumed by many Westerners and others.

An Islamic government will be governed by the principle of consultation (shura). Qutb gives many examples of it from the early days of Islam. The exact form of shura varies with circumstances and, in accordance with the realistic and practical nature of the Islamic tasawwur, will be determined only when such a government is actually formed. Nevertheless, in at least one place he does outline a structure of government involving a ruler (imam) nominated by the recognized leaders of the community (literally: “people of binding and loosing”, a recognized phrase in Arabic) and chosen by the whole community. There will also be a parliament (majlis al-shura) whose members are chosen by the people locally. The high moral tone of the government is more important, however, than these details. Qutb seems to envisage the imam as a strong and righteous leader who is normally to be obeyed implicitly, but not if he commands people to disobey God. He rejects the term “democracy” because he sees it as a Western concept involving government by the people instead of by God.

For all that Qutb seems to envisage the true Islamic state and society as a kind of utopia, he recognizes that actual Islamic societies have been less than ideal, and he severely criticizes many of the historical Muslim rulers without quite calling their government and society un-Islamic. In at least one place he states a ruler may be unjust but still be considered Islamic if he basically recognizes the authority of God.

Economics in an Islamic society is based on the fact that all wealth belongs to God, who entrusts it to human societies and thence to individuals as his khalifas. On this basis, the right to private property is guaranteed as a reward for work so that individuals are encouraged to work for their own benefit and the benefit of all. This strikes a just balance between effort and reward and accords with human nature. Private property, however, is limited legally by the institution of Zakat, which requires a portion of one’s wealth to be given away and is one of the Pillars of Islam. It is also limited by the right of the political leader to tax further when this is necessary for the welfare of the community and to assist the needy, who have a recognized right to a share in the community’s wealth. Islam also opposes the concentration of wealth in a few hands, and its rules on inheritance and opposition to usury are designed to discourage this. Likewise, the community should own collectively resources needed for the general wellbeing, and these have expanded considerably in modern times. Added to all of this is the additional moral obligation on individuals to assist the needy and contribute to social causes. In discussing economics, Qutb often goes beyond what the traditional sources of authority prescribe, especially in relation to the economic power of the state. What he writes would be largely acceptable to modernists with a moderate socialist inclination.

Qutb is at pains to point out that women and men are equal in respect of their humanity as such. He even argues that Eve was not created from Adam’s rib but created in the same way as Adam (the account of Adam’s rib is not in the Qur’an but is in later sources). In temperament, however, women and men differ. Women are more emotional and men more rational. Women’s temperament fits them for raising children and other domestic tasks, whereas men are more fitted for the world of work outside the home. Hence, men have the right to leadership within the family and women the right to protection.

The family is the basic unit of society and the institution that produces human values; its place is rooted in the cosmic order. Obedience to God in matters relating to marriage, divorce, and family is service to God no less than formal prayer. Thus, women’s primary role of caring for the family is extremely important. For this reason, women should not work outside the home unless it is absolutely necessary. Moreover, those who do are likely to be exploited both sexually and economically, turned into sex objects and underpaid. He also believes that young children should be cared for within the home, not in crèches. He draws on his experiences in the United States, among other things, to support these points. All of these things characterize a jahili society, according to him. He also argued that Western women sought election to parliament because men had been making laws unfair to women, but under a system of divinely based law the laws will be fair.

Women should dress in a manner that shows only their faces and hands but not be secluded, as in some societies. They also should not mix publicly with men as this may lead to promiscuity and weaken marriages. He defends divorce and polygyny, at least under certain conditions. If these seem to make women insecure it is because the present society is jahili and not sufficiently attuned to Islamic values. Although Muslim men are permitted in traditional fiqh to marry Jewish or Christian women, Qutb is inclined to oppose this today since it may weaken Muslims’ faith and sense of identity, given that current Muslim societies are only nominally Muslim. It is worth noting that Qutb evidently had no objection to women’s involvement in the Islamic movement. Both of his sisters were involved, and one went to prison. He was also a mentor to Zaynab al-Ghazali, a well-known woman Islamic activist in Egypt who had put into her marriage contract that her husband would not interfere with her Islamist activities.

9. Jahiliyya (Dystopia) and Jihad (Revolution)

Any society that is not governed according to the Shari’a is a jahili society. The term jahiliyya literally means ignorance with a connotation of barbarism and has most often been applied to the Arabian society on the eve of Muhammad’s mission. The term and general idea come from Mawdudi, but Qutb makes it more extreme. For Mawdudi, contemporary Muslim societies are part Muslim and part jahili, while for Qutb there is no such mid-term. The contrast is stark: a society is either Islamic or jahili. A jahili society compels or at least pressures its members to serve other humans rather than God, and its leaders presume to create values and laws rather than apply the values and laws of God, effectively claiming divine attributes and making themselves gods beside God. The moral, psychological, and social results are disastrous, though it is not these results that define a jahili society. Many states claim to be Islamic and claim that their laws are based on the Shari’a or partly so when in reality the laws are man-made and they are jahili societies. In fact, Qutb claimed that all so-called Islamic countries in his time were jahili, with the result that, as he put it, Islam does not exist. This does not mean that there are no Muslims, but it does mean that they cannot live a complete Muslim life. While Qutb labels societies jahili, he is much less inclined to label individuals as unbelievers (kafir), unlike some of his Qutbist successors.

Although the line between Islam and jahiliyya is stark in principle, Qutb does not clearly indicate exactly how and where it is drawn. It seems that societies whose leaders sincerely recognize the Shari’a even if they often fall short in practice will still be Islamic, while others that appear morally superior but whose leaders do not accept the Shari’a, or who interpret it in a Westernizing way, will be jahili, though Qutb will assume that the moral difference is temporary or more apparent than real. This is consistent with Qutb’s views, mentioned above, that ideas are primary and that faith is necessary for works to be valid.

The answer to jahiliyya for Qutb is jihad. This word, which appears frequently in the Qur’an and the later tradition, means “striving” and the full phrase is “striving in the path of God”. It may take non-violent forms, such as the “greater jihad”, the struggle against evil tendencies within one’s self (referred to by the prophet), or other forms of righteous striving. In juristic and political circles, the term has mainly referred to the violent activity of war, with rules for proper behavior in warfare elaborated. Thus, the term is often translated “holy war”. This is the usage that Qutb draws on. In modern times, many Muslims have preferred to emphasize the non-violent forms of jihad and to limit violent jihad to defensive warfare. Qutb considers this defeatist and argues the need for both violence and the initiating of violence at times. Jahiliyya is not merely a condition of society but an aggressive and unrelenting force that can only finally be defeated by violence. Moreover, Muslims have an obligation not only to defend themselves but to fight tyranny wherever it appears and to remove obstacles to the preaching of Islam.  Jihad is part of the Islamic mission to liberate humans from servitude to other humans and realize the rule of God on earth. This is the greatest of all human tasks and one should not apologize for using force when necessary. God knows that evil must be confronted in this way. (Perhaps this attitude is not so different from the actions of Western powers fighting to spread civilization, democracy, and/or human rights.) Qutb relates the “greater jihad” to this by describing it as the inner battle of the warrior to purify himself of personal desires and any other obstacles to his serving God and establishing God’s authority on earth.

In the present situation jihad takes effectively the form of revolution, though Qutb does not use this term. (He may be influenced by Mawdudi’s book, Jihad in Islam, which explicitly calls it “revolutionary struggle”, at least in the English translation.) Individuals or groups of Muslims must come together to organize their lives on the basis of Islam, thus giving birth to a new society and isolating themselves psychologically, though not physically, from the jahili society around them. These groups will for a long time devote themselves to studying and internalizing the basic Muslim creed, there is no god but God. This is what Muhammad did for thirteen years in Mecca, before any attempt to establish an Islamic society was made. Soon enough, the Muslim group will be attacked by the jahiliyya and have to respond in ways that probably include violence until it replaces or at least holds its own against the jahiliyya. In the early stages, violence is to be avoided except for self-defence though later it may be initiated, as mentioned above. All of this according to Qutb is based on the example of the Prophet’s actions in Mecca and Medina and represents a realization of the second part of the creed, “Muhammad is the Messenger of God.”

10. Human History

Qutb explicitly rejects the Enlightenment idea of continuous human progress, at least in the moral area. Rather, in accordance with the traditional Muslim view, history is characterized by a series of prophetic missions, often representing moral high points, followed by decline. The first prophet was the first man, Adam. Although he and his wife disobeyed God and were expelled from Paradise, they repented and were pardoned; their pure fitra was re-established though they now lived in a world of physical and moral struggle. Many of the ensuing prophets preached to peoples who rejected them and were destroyed by God, but some, in particular Ibrahim (Abraham), Musa (Moses), Daud (David), and ‘Isa (Jesus) left continuing communities, though these communities changed the revelations they had received. Each of the messengers taught the same truths about God and the universe, though in increasingly advanced forms as befit their societies’ development, until the human race reached its maturity and Muhammad brought the final revelation and most complete and universal message, confirming but superseding the previous messages. The high point of human moral and social history was the community in Medina under the prophet and his immediate successors. The Muslim community continued for some twelve centuries, often prospering politically and culturally though declining morally.

In the West, a corrupted form of Christianity was imposed on people and this eventually led to a rebellion against religion and to the anti-religious philosophies (“Positivism”, “Dialectical Materialism”, etc.) so prevalent by the twentieth century. The West also began to attack the Muslim world militarily during the medieval crusades and this crusading continued later in the form of Western imperialism. This is a common idea among Islamists today, who regularly refer to Westerners as crusaders. As a result of Western imperialism, Muslim societies began to adopt Western ways and abandoned the Shari’a, often without admitting it, so that by Qutb’s time there was no longer a truly Islamic society anywhere. The whole world is in a state of jahiliyya, and this jahiliyya, because of its material advancement and sophistication, is deeper than previous ones. Although the previous wave of Islam has left some traces, such as the idea of the unity of the human race, that might ease the rebirth of Islam, this will happen only by God’s will working through Islamic activists. A new Islamic society will not be morally better than the “unique Qur’anic generation” except (one may note, though Qutb does not say) that its moral status will be linked to much better technology.

11. Death, Judgment, Martyrdom

Qutb held to the traditional view that death is followed by resurrection on the Last Day, by divine judgment on the basis of one’s actions, and a final abode in paradise or hell. This, finally, is the greatest motive for service to God in this life. How God will raise people to life after they are dead is one of the divine secrets that human reason cannot understand, just as it cannot understand the secret of life generally. He seems to take the Qur’anic descriptions of judgment, heaven, and hell at face value, sometimes analysing the language and literary force of the accounts. These scenes are related to this world since worldly actions lead to them and worldly joy and suffering foreshadow them. They also widen the individual’s perspective beyond the bounds of this life. He also held to the common view that God has fixed the date of each person’s death, a good reason to risk martyrdom in revolutionary action.

The situation of martyrs, those who die in jihad, is distinctive. The Qur’an says, “Do not say of those who are killed in the path of God, ‘They are dead.’ They are alive . . .” (Qur’an 2:154; 3:169). Qutb says that they are alive in the sense that they continue to be an active force directing the community, but that they also may be more literally alive on another level of existence that we cannot conceive of. Toward the end of Milestones, he says that martyrs receive three rewards: contentment and freedom from fear and sorrow, praise from angels and humans, and favorable accounting in the final judgment (I have seen no mention of 72 virgins, however). Qutb is considered a martyr by many, probably most, Muslims. It is reported that on learning that he was to be executed he praised God for earning martyrdom. Both Zaynab al-Ghazali and Qutb’s sister, Hamida, claimed to have had visions just after his death assuring them that he is in paradise.

12. Qutb’s Legacy

Qutb’s ideas, strengthened by his status as a martyr, have had considerable influence among Muslims. His close linking of belief in one God with the need for the rule of a divinely derived law, and his insistence on a clear line between Islam and non-Islam, have strengthened Islamism generally. His understanding of jahiliyya has broadened the scope and depth of the struggle. His conceptualization of the movement as one for “liberation” resonates with many people, as does his view that all forms of activity should be service to God. His understanding of jihad and his own martyrdom have strengthened the willingness for both violence and self-sacrifice. One young man who was moved by his execution to join an Islamist cell was Ayman al-Zawahiri, who later became a leader in the radical group Tanzim al-Jihad, and still later leader of al-Qaeda. Within the Muslim Brothers organization, Qutb’s legacy has been ambivalent: a threat to their ability to function with some freedom, but not possible to ignore. In 2009, his ideas were at the forefront of a debate between those who wanted less accommodation to secular society and those who wanted more.

Those who particularly claim to follow his legacy, mostly outside the Brothers, have commonly been called Qutbists or Qutbians. They include the so-called Takfir wa Hijra (the label refers to their condemnation of society and separation from it), Jama‘a Islamiyya (Islamic Group), and Tanzim al-Jihad (Jihad Organization) in Egypt, and al-Qaeda. (It is not clear where “Islamic State” or ISIS stands on Qutb.) They tend to simplify Qutb’s ideas or take them to extremes that he might not have accepted. This article considers their interpretations of some of Qutb’s ideas.

Qutb’s idea of jahiliyya is a fairly easy idea to misunderstand. It has often been interpreted as takfir, the declaration of individuals as unbelievers or apostates, usually applied to enemies or government representatives. Jama‘a Islamiyya and Tanzim al-Jihad spoke more of kufr than jahiliyya. They considered Egyptian society as a whole to be Muslims and only the leaders of society to be kafirs. On this assumption, some of the Tanzim al-Jihad members assassinated the Egyptian president in 1981, hoping by this to spark a revolt and overthrow the government, something that did not happen. On Qutb’s view of jahiliyya, this effort would have been hopelessly misguided and premature.

The leader of the so-called Takfir wa Hijra group, who had reportedly studied Qutb’s writings in prison, accepted the claim that the whole Egyptian society was jahili, but with a more extreme interpretation than Qutb’s. He claimed that any of its members who left his group were abandoning Islam and that the standard Friday prayers were illicit in a jahili society. He also tried to isolate the group physically from society more than Qutb called for. Outsiders have interpreted its position as takfir and apparently insiders have too, since they came to accept the label.

The distinction between the “near enemy” (their own rulers) and the “far enemy” (for example, Israel and the United States) made by the Jama‘a Islamiyya and Tanzim al-Jihad, and their choice to attack the “near enemies” first, does owe something to Qutb’s idea of jahiliyya, since this idea removes Egyptian society from the category of Islamic. Al-Qaeda’s view of the world-wide struggle also seems to fit Qutb’s idea, though al-Qaeda changed the priority to the “far enemy”. Qutb might have accepted this as a practical example of flexibility after the attack on the “near enemy” failed.

Qutb called for a long period of preparation before engaging in jihad, but Tanzim al-Jihad and Jama‘a Islamiyya advocated immediate action. While al-Qaeda trains its recruits militarily and indoctrinates them, it does not appear to provide the sort of long-term spiritual preparation Qutb had in mind. The leader of Takfir wa-Hijra appreciated the need for a long period of preparation, which is one of the reasons he sought to isolate the group. He hoped to build a model community that would eventually be strong enough to bring down the government. Unfortunately for them, police arrested some of the group and the group in return kidnapped a former government minister, whom they killed when the government refused to release the prisoners. The government then cracked down and succeeded in capturing and executing the group’s leaders.

While Qutb defended the need for, and almost the inevitability of, violence in certain circumstances, this was to counter those who downplayed it, often for apologetic reasons. It is doubtful (though impossible to know) whether Qutb would have approved of the terrorist activities of the Qutbist groups. For the most part, they do not make sense if jahiliyya is as deeply rooted as Qutb claims and, in any case, Qutb accepted the traditional fiqh view that non-combatants should not be targeted. Also, revenge has often been a motive for violent actions, but Qutb appears to have rejected that motive. Perhaps the most important contribution of Qutb’s theories is that, for his followers, they remove the legitimacy from the existing authorities and make those authorities look ultimately like “paper tigers.”

Qutb has been criticized by traditional scholars on particular points of fiqh and history and generally for making judgments about religion without the sort of training they consider necessary. Also, Sunnis have generally taken the position that for this-worldly purposes a person is to be treated as Muslim if he is outwardly one, whatever his behaviour, and likewise, the government is to be treated as Muslim as long as the rulers are outwardly so. Many see Qutb’s views about jahiliyya and jihad as violations of this.

Many who are not radical Islamists still appreciate many of Qutb’s ideas and ultimate goals. Often it is argued that his extreme views were the result of his imprisonment and torture and that, had he lived longer, his ideas and activities would have developed in a more moderate direction. They also like to call attention to his earlier works, which contain less extreme views than those discussed in this article.

13. Final Remarks: Aesthetics, Harmony, and Essentialism

There is a strongly aesthetic dimension to Qutb’s writing, and one could say that its master theme is harmony. God’s universe is a perfectly harmonious system into which everything fits beautifully and practically. This universe is friendly to life, and human life can be in full harmony with it and blessed. Disharmony comes when humans act in ways that contravene the ways God has set out for them. The beauty of God’s harmony makes the disharmony introduced by humans all the worse, like a beautiful painting disfigured. Hence the horror of jahiliyya and seriousness of the effort to end it.

Connected with this is the resolutely essentialist nature of Qutb’s thinking. Everything is essentialized, including nature, humanity, gender, Islam, the West, jahiliyya, Shari’a, belief, and unbelief. Perhaps God is a partial exception, since His essence is unknowable and His freedom to produce miracles may break the regularities on which human essentialism depends. A major aspect of this essentialism is the dichotomously “Manichean” way in which he treats good and evil. As mentioned above, there is no mid-term between guidance and error, faith and unbelief, tawhid and shirk, or between Islam and jahiliyya or Shari’a and human legislation. Although the interpretation of the Shari’a may require human effort (ijtihad), and its application may vary with circumstances, the difference in principle between divinely sourced and humanly sourced is stark.

This combination of aesthetics, essentialism, and “Manicheism,” while very much open to criticism from scientists and philosophers, is undoubtedly one of the keys to the power of his ideology. The strong contrast between good and evil, the sense that evil is currently in charge in the world though good is in ultimate control, and the conviction that something can and must be done at any cost to change this situation have characterized and driven many a revolutionary ideology.

14. References and Further Reading

a. Primary Sources

  • Qutb, Sayyid, In the Shade of the Qur’an (Fi zilal al-Qur’an), 18 vols, Translated & Edited by: M.A. Salahi & A.A Shamis, Leicester, UK: The Islamic Foundation, 1999-2005.
    • Qutb’s massive and popular commentary on the Qur’an. Much of it was written before his most radical period but the first 13 (of 30) parts were revised during that period.
  • Qutb, Sayyid, The Islamic Concept and its Characteristics (Khasa’is al-tasawwur al-islami wa-muqawwimatuhu), trans. Mohammed M. Siddiqui. Indianapolis: American Trust Publications, 1991.
    • The most “philosophical” of Qutb’s late works, used considerably for this article. It is the first of two volumes on the subject; the second has not been translated into English.
  • Qutb, Sayyid, Basic Principles of the Islamic Worldview, trans. Rami David. North Haledon, N.J.: Islamic Publications International IPI, 2006.
    • A later translation of the same work as above.
  • Qutb, Sayyid, Islam: The Religion of the Future (Al-mustaqbal li-hadha al-din), translator not given. Beirut and Damascus: The Holy Koran Publishing House, n.d.
    • A shorter book stating the main points and emphasizing the need of humanity for Islam. Comments on quotes from Alexis Carrel and John Foster Dulles.
  • Qutb, Sayyid, Milestones (Ma‘alim fi al-tariq), trans. S. Badrul Hasan [?]. Kuwait: International Islamic Federation of Student Organizations, 1978. Also, Lahore: Kazi Publications, nd. The title is sometimes translated “Signposts”.
    • Qutb’s best known radical work, a handbook for Islamic revolution.
  • Qutb, Sayyid, Milestones, “revised translation”, translator not given. Indianapolis: American Trust Publications, 1990.
    • Claims to provide “a fresh editing and rereading” but I cannot confirm that it does so from what I have read of it.
  • Qutb, Sayyid, This Religion of Islam (Hadha al-din), translator not given. Kuwait: International Islamic Federation of Student Organizations, 1972.
    • Summarizes the characteristics of the Islamic manhaj and its positive effect on the world in the past. Relatively optimistic.
  • [Qutb, Sayyid] Sayyid Qutb and Islamic Activism: A translation and critical analysis of Social Justice in Islam (Al-‘adala al-ijtima‘iyya fi al-islam). By William Shepard, Leiden: Brill, 1996.
    • Last edition of Qutb’s major work on Islamic social and political teachings. Comparisons are made with earlier editions.
  • The Sayyid Qutb Reader, ed. Albert J. Bergesen. Routledge, 2007.
    • Includes an introduction to Qutb’s career and ideas, and selections mainly from the radical parts of In the Shade of the Qur’an, along with some from Milestones, Social Justice in Islam, and A Child from the Village (autobiographical account of his childhood village, written before he became an Islamist).

b. Secondary Sources

  • Abu-Rabi‘, Ibrahim, Intellectual Origins of Islamic Resurgence in the Modern Arab World. Albany: SUNY Press, 1996.
    • Chapter 3 deals with the Muslim Brothers and chapters 4 to 6 cover Qutb’s pre-Islamist, early Islamist and later Islamist thinking.
  • Calvert, John, Sayyid Qutb and the Origins of Radical Islamism. New York: Columbia University Press, 2010.
    • Excellent study of Qutb’s activities and writings during both his secularist and Islamist periods, with helpful information on the social and political background and a survey of later “Qutbists”.
  • Carré, Olivier, Mysticism and Politics: A Critical Reading of Fî Zilal al-Qur’an by Sayyid Qutb (1906-1966), Leiden, Boston: Brill, 2003.
    • An in-depth study of Qutb’s Qur’an commentary. Includes selections from the text.
  • Haddad, Yvonne Y., ‘Sayyid Qutb: Ideologue of Islamic Revival’, ch. 4 in Voices of Resurgent Islam, ed. J. Esposito. New York and Oxford: Oxford U. P., 1983.
    • Includes a discussion of Qutb’s main concepts.
  • Kepel, Gilles, Muslim Extremism in Egypt: The Prophet and the Pharaoh. Berkeley & Los Angeles, 1986 and Berkeley: University of California Press, 2003.
    • Chapters 1 and 2 discuss Qutb’s last years and Milestones. The rest of the book deals with later radical groups in Egypt.
  • Musallam, Adnan, From Secularism to Jihad: Sayyid Qutb and the Foundations of Radical Islamism. Praeger, 2005.
    • Thoughtful account of the whole of Qutb’s life, career and writings, especially good on the earlier years. Also deals with Qutb’s influence on later radicals.
  • Shepard, William, “Sayyid Qutb’s doctrine of Jahiliyya”, International Journal of Middle East Studies 35/4 (Nov. 2003): 521-545.
    • Discusses the background to and components of this doctrine.
  • Shepard, W., “Islam as a ‘System’ in the Later Writings of Sayyid Qutb”, Middle Eastern Studies 25/1 (January 1989): 31-50.
    • Discusses key terms such as tasawwur and manhaj.
  • Toth, James. Sayyid Qutb: The Life and Legacy of a Radical Islamic Intellectual. Oxford: Oxford UP, 2013.
    • A good study of Qutb’s life and ideas with a lot of interesting information.

Author Information

William E. Shepard
Email: w.shepard@snap.net.nz
University of Canterbury
New Zealand

Set Theory

Set Theory is a branch of mathematics that investigates sets and their properties. The basic concepts of set theory are fairly easy to understand and appear to be self-evident. However, despite its apparent simplicity, set theory turns out to be a very sophisticated subject. In particular, mathematicians have shown that virtually all mathematical concepts and results can be formalized within the theory of sets. This is considered to be one of the greatest achievements of modern mathematics. Given this achievement, one can claim that set theory provides a foundation for mathematics.

The foundational role of set theory and its mathematical development have raised many philosophical questions that have been debated since its inception in the late nineteenth century. For example, here are three: Does infinity exist, and if so, are there different kinds of infinity? Is there a mathematical universe? Are all mathematical problems solvable?

Before pursuing the philosophical issues concerning set theory, one should be familiar with a standard mathematical development of set theory. This article presents such a development. A companion article addresses the philosophical issues that are raised by set theory and its development.

In the late nineteenth century, the mathematician Georg Cantor (1845–1918) created and developed a mathematical theory of sets. This theory emerged from his proof of an important theorem in real analysis. In this proof, Cantor introduced a process for forming sets of real numbers that involved an infinite iteration of the limit operation. Cantor’s novel proof led him to a deeper investigation of sets of real numbers and to his theory of abstract sets. Cantor’s creation now pervades all of mathematics and offers a versatile tool for exploring concepts that were once considered to be ineffable, such as infinity and infinite sets.

Sections 1 and 2 below describe the “naïve” principles of set theory that were used and developed by Cantor. Then, Section 3 describes a more sophisticated (axiomatic) approach to set theory that arose from the discovery of Russell’s paradox. After identifying the Zermelo-Fraenkel axioms of set theory, Section 4 discusses Cantor’s well-ordering principle and examines how Cantor used the well-ordering principle to develop the ordinal and cardinal numbers. Section 5 considers controversies concerning the well-ordering principle and its equivalent, the axiom of choice. Sections 6, 7, and 8 then introduce the cumulative hierarchy of sets, Kurt Gödel’s universe of constructible sets, and Paul Cohen’s method of forcing, respectively. The latter two topics, explored in Sections 7 and 8, can be used to show that certain questions are unresolvable when assuming the Zermelo-Fraenkel axioms (with or without the axiom of choice). The next two sections address further developments in set theory that are intended to settle these and other unresolved questions; namely, Section 9 discusses large cardinal axioms, and Section 10 investigates the axiom of determinacy.

Table of Contents

  1. On the Origins
  2. Cantor’s Development of Set Theory
    1. Russell’s Paradox
  3. The Zermelo-Fraenkel Axioms
    1. The Axioms
    2. Classes
  4. Cantor’s Well-Ordering Principle
    1. Ordinal Numbers
    2. Cardinal Numbers
  5. The Axiom of Choice
    1. On Zermelo’s Proof of the Well-Ordering Principle
    2. Banach-Tarski Paradox
  6. The Cumulative Hierarchy
  7. Gödel’s Constructible Universe
  8. Cohen’s Forcing Technique
  9. Large Cardinal Axioms
  10. The Axiom of Determinacy
  11. Concluding Remarks
  12. References and Further Reading
    1. Primary Sources
    2. Secondary Sources
    3. Internet Sources

1. On the Origins

Let us first discuss a few basic concepts of set theory. A set is a well-defined collection of objects. The items in such a collection are called the elements or members of the set. The symbol “\in” is used to indicate membership in a set. Thus, if A is a set, we write x \in A to say that “x is an element of A,” or “x is in A,” or “x is a member of A.” We also write x \notin A to say that x is not in A. In mathematics, a set is usually a collection of mathematical objects, for example, numbers, functions, or other sets.

Sometimes a set is identified by enclosing a list of its elements in curly brackets; for example, a set of natural numbers A can be identified by the notation

A = \{1,2,3,4,5,6,7,8,9\}. 

More typically, one forms a set by enclosing a particular expression within curly brackets, where the expression identifies the elements of the set. To illustrate this method of identifying a set, we can form a set B of even natural numbers, using the above set A, as follows:

B = \{n \in A : n \text{ is even}\},

which can be read as “the set of n \in A such that n is even.” Of course,

\{n \in A : n \text{ is even}\} = \{2,4,6,8\}. 

It is difficult to identify the genesis of the set concept. Yet, the idea of a finite collection of objects has existed for as long as the concept of counting. Indeed, mathematicians have been investigating finite sets and methods for measuring the size of finite sets since the beginning of mathematics. For example, the above two sets

A=\{1,2,3,4,5,6,7,8,9\}

B=\{2,4,6,8\} 

are finite sets. As every element in B is an element in A, the set B is said to be a subset of A, denoted by B \subseteq A. Since there are elements in A that are not in B, we say that B is a proper subset of A. Moreover, the number of elements in B is strictly smaller than the number of elements in A. Thus, one can say, “the whole A is greater in size than its proper part B.”
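
For readers who like to experiment, the set-builder notation and the subset relation described above can be mirrored with Python's built-in set type. This is only an informal sketch for finite sets; the variable names follow the sets A and B in the text, and the helper names `is_subset` and `is_proper_subset` are introduced here purely for illustration.

```python
# The finite set A from the text, modelled with Python's built-in set type.
A = {1, 2, 3, 4, 5, 6, 7, 8, 9}

# Set-builder notation {n in A : n is even} written as a set comprehension.
B = {n for n in A if n % 2 == 0}

# B <= A tests "B is a subset of A"; B < A tests "B is a proper subset of A".
is_subset = B <= A
is_proper_subset = B < A

# For finite sets, a proper part has strictly fewer elements than the whole.
print(sorted(B), is_subset, is_proper_subset, len(B) < len(A))
```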

Infinite sets lead to an apparent contradiction. Consider the infinite sets:

C=\{0,1,2,3,\ldots \}

D=\{1,3,5,7, \ldots \}. 

We view the sets C and D as existing entities that both contain infinitely many elements. Thus, C and D are “completed infinities.” Observe that every element in D is in C, and that D is a proper subset of C. However, if, as many mathematicians once believed, “infinity cannot be greater than infinity,” then the whole C is not greater in size than its proper part D. This counterintuitive result was viewed by many early prominent mathematicians as being contradictory, as it appeared to conflict with the well-understood behavior of finite sets. These mathematicians thus concluded that the concept of a “completed infinity” should not be allowed in mathematics.

For this reason, before Cantor, a majority of mathematicians considered infinite collections to be mathematically illicit objects. Cantor was the first mathematician to view infinite sets as being legitimate mathematical objects that can coexist with finite sets. Clearly, the size of a finite set can be measured simply by counting the number of elements in the set. Cantor was the first to investigate the following question:

Can the concept of “size” be extended to infinite sets? 

Cantor addressed this question in the affirmative by using the concept of a function to measure and compare the sizes of infinite sets. Functions are widely used in science and mathematics. For sets A and B, we say that f is a function from A to B, denoted by f: A \rightarrow B, if and only if f is a relation (operation) that assigns to each element x in A, a single element f(x) in B. There are three important properties that a function might possess:

  • f: A \rightarrow B is an injection if and only if for each y in B there is at most one x in A such that f(x)=y.
  • f: A \rightarrow B is a surjection if and only if for each y in B there is at least one x in A such that f(x)=y.
  • f: A \rightarrow B is a bijection if and only if for each y in B there is exactly one x in A such that f(x)=y.

Observe that f: A \rightarrow B is an injection if and only if distinct elements in A are assigned to distinct elements in B; that is, for all x and a in A, if x \neq a, then f(x) \neq f(a). Also note that f: A \rightarrow B is a bijection if and only if f: A\rightarrow B is an injection and a surjection.
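
The three properties can be checked mechanically when A and B are finite. The sketch below is an illustration under that assumption: a function is represented as a Python dictionary, and the helper names (`is_function`, `is_injection`, `is_surjection`, `is_bijection`) as well as the sample functions `f` and `g` are hypothetical choices, not notation from the article. Each test simply counts, for every y in B, how many x in A satisfy f(x) = y.

```python
def is_function(f, A, B):
    """f (a dict) assigns to each x in A a single element f[x] of B."""
    return set(f) == set(A) and all(f[x] in B for x in A)

def preimage_count(f, A, y):
    """Number of x in A with f(x) = y."""
    return sum(1 for x in A if f[x] == y)

def is_injection(f, A, B):
    # For each y in B there is at most one x in A with f(x) = y.
    return all(preimage_count(f, A, y) <= 1 for y in B)

def is_surjection(f, A, B):
    # For each y in B there is at least one x in A with f(x) = y.
    return all(preimage_count(f, A, y) >= 1 for y in B)

def is_bijection(f, A, B):
    # For each y in B there is exactly one x in A with f(x) = y.
    return all(preimage_count(f, A, y) == 1 for y in B)

A = {1, 2, 3}
B = {"a", "b", "c"}
f = {1: "a", 2: "b", 3: "c"}   # injective and surjective, hence a bijection
g = {1: "a", 2: "a", 3: "b"}   # neither injective nor surjective
```

On these inputs, `is_bijection(f, A, B)` agrees with `is_injection(f, A, B) and is_surjection(f, A, B)`, matching the observation in the preceding paragraph.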

Cantor observed that two sets A and B have the same size if and only if there is a one-to-one correspondence between A and B, that is, there is a way of evenly matching the elements in A with the elements in B. In other words, Cantor noted that the sets A and B have the same size if and only if there is a bijection f: A \rightarrow B. In this case, Cantor said that A and B have the same cardinality. For an illustration, let \mathbb{N} = \{0, 1, 2, 3, 4, \ldots \} be the set of natural numbers and let E = \{0,2,4,6,8,\ldots\} be the set of even natural numbers. Now let f: \mathbb{N} \rightarrow E be defined by f(n)=2n. One can verify that f: \mathbb{N} \rightarrow E is a bijection and, thus, we obtain the following one-to-one correspondence between the set \mathbb{N} of natural numbers and the set E of even natural numbers:

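\begin{array}{rcccccccc} \mathbb{N}: & 0 & 1 & 2 & 3 & 4 & \cdots & n & \cdots \\ & \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & & \updownarrow & \\ E: & 0 & 2 & 4 & 6 & 8 & \cdots & 2n & \cdots \end{array}

Hence, each natural number n corresponds to the even number 2n, and each even natural number 2i is thereby matched with i \in \mathbb{N}. The bijection f: \mathbb{N} \rightarrow E specifies a one-to-one match-up between the elements in \mathbb{N} and the elements in E. Cantor concluded that the sets \mathbb{N} and E have the same cardinality.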
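
A computer can only inspect a finite initial segment of this correspondence, but doing so may still make the match-up concrete. The following sketch assumes a cut-off of ten natural numbers; the names `f_inverse`, `N_prefix`, `E_prefix`, and `pairs` are illustrative and do not appear in the text.

```python
def f(n):
    """The map f(n) = 2n from the natural numbers to the even natural numbers."""
    return 2 * n

def f_inverse(m):
    """The inverse map, defined only for even natural numbers m."""
    return m // 2

N_prefix = range(10)                    # 0, 1, ..., 9
E_prefix = [f(n) for n in N_prefix]     # 0, 2, ..., 18
pairs = list(zip(N_prefix, E_prefix))   # (n, 2n) for each n in the prefix

for n, e in pairs:
    print(n, "<->", e)
```

Since `f_inverse(f(n))` returns n for every n, no two natural numbers are sent to the same even number, and every even number 2i is reached from i.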

Cantor also defined what it means for a set C to be smaller, in size, than a set D. Specifically, he said that C has smaller cardinality (smaller size) than D if and only if there is an injection f: C \rightarrow D but there is no bijection g: C \rightarrow D. Cantor then proved that there is no one-to-one correspondence between the set of real numbers and the set of natural numbers. Cantor’s proof showed that the set of real numbers has larger cardinality than the set of natural numbers (Cantor 1874). This stunning result is the basis upon which set theory became a branch of mathematics.

The natural numbers 0, 1, 2, 3, \ldots are the whole numbers that are typically used for counting. The real numbers are those numbers that appear on the number line. For example, the natural number 2, the integer -3, the fraction 6/5, and all of the other rational numbers are real numbers. The irrational numbers, such as \sqrt{2} and \pi, are also real numbers. Again, let \mathbb{N} = \{0, 1, 2, 3, \ldots \} be the set of natural numbers, and let \mathbb{R} be the set of real numbers. If a set is either finite or has the same cardinality as the set of natural numbers, then Cantor said that it is countable. Since the set of real numbers \mathbb{R} is larger, in size, than the set of natural numbers \mathbb{N}, Cantor referred to the set \mathbb{R} as being uncountable.

After proving that the set of real numbers is uncountable, Cantor was able to prove that there is an increasing sequence of larger and larger infinite sets. In other words, Cantor showed that there are “infinitely many different infinites,” a result with clear philosophical and mathematical significance.

After his introduction of uncountable sets, in 1878, Cantor announced his Continuum Hypothesis (CH), which states that every infinite set of real numbers is either the same size as the set of natural numbers or the same size as the entire set of real numbers. There is no intermediate size. Cantor struggled, without success, for most of his career to resolve the Continuum Hypothesis. The problem persisted and became one of the most important unsolved problems of the twentieth century. After Cantor’s death, most set theorists came to believe that the Continuum Hypothesis is unresolvable.

Cantor’s profound results on the theory of infinite sets were counterintuitive to many of his contemporaries. Moreover, Cantor’s set theory violated the prevailing dogma that the notion of a “completed infinity” should not be allowed in mathematics. Thus, the outcry of opposition persisted. Influential mathematicians continued to argue that Cantor’s work was subversive to the true nature of mathematics. These mathematicians believed that infinite sets were dangerous fictional creations of Cantor’s imagination and that Cantor’s fictions needed to be eradicated from mathematics (Dauben 1979, p. 1; Dunham 1990, pp. 278-280). Nevertheless, Cantor’s theory of sets soon became a crucial tool used in the discovery and establishment of new mathematical results, for example, in measure theory and the theory of functions (Kanamori 2012). Mathematicians slowly began to see the utility of set theory to traditional mathematics. Accordingly, attitudes started to change and Cantor’s ideas began to gain acceptance in the mathematical community (Dauben 1979, pp. 247-248). The significance of Cantor’s mathematical research was eventually recognized. David Hilbert, a prominent twentieth-century mathematician, described Cantor’s work as being

the finest product of mathematical genius and one of the supreme achievements of purely intellectual human activity. (Hilbert 1923)

Ultimately, Cantor’s theory of abstract sets would dramatically change the course of mathematics.

2. Cantor’s Development of Set Theory

In his development of set theory, Cantor identified a single fundamental principle, called the Comprehension Principle, under which one can form a set. Cantor’s principle states that, given any specific property \varphi(x) concerning a variable x, the collection \{x : \varphi(x)\} is a set, where \{x : \varphi(x)\} is the set of all objects x that satisfy the property \varphi(x). For example, let \psi(x) be the property that “x is an odd natural number.” The Comprehension Principle implies that

S = \{ x : \psi (x)\} = \{1,3,5,7,\ldots \} 

is a set. Employing the Comprehension Principle, one can form the intersection of two sets A and B using the property “x \in A and x \in B”; thus, the intersection of A and B is the set

A \cap B = \{x : x \in A \text{ and } x \in B\}. 

One can also form the set

A \cup B = \{x : x \in A \text{ or } x \in B\} 

which is called the union of A and B. Recall that one writes X \subseteq A to mean that X is a subset of A, that is, every element of X is also an element of A. Using the Comprehension Principle, one can form the power set of A, which is the set whose elements are all of the subsets of A, that is,

\wp(A) = \{ X : X \subseteq A\}. 

Thus, if A is a set and X \subseteq A, then X \in \wp(A). So, if A = \{1,2,3\} and B = \{3,4,5\}, then

A \cap B = \{3\},
A \cup B = \{1,2,3,4,5\}, and

\wp(A) = \{\varnothing,\{1\},\{2\},\{3\},\{1,2\},\{1,3\},\{2,3\},\{1,2,3\}\}, 

where \varnothing denotes the empty set, that is, the set that contains no elements. The Comprehension Principle was an essential tool that allowed Cantor to form many important sets. Cantor’s approach to set theory is often referred to as naïve set theory.
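
The three operations in this example can be reproduced for finite sets in a few lines of Python. This is a minimal sketch: the helper `power_set` is a hypothetical name introduced here, built from the standard `itertools` functions `chain.from_iterable` and `combinations`, and subsets are stored as `frozenset` objects so that they can themselves be elements of a set.

```python
from itertools import chain, combinations

def power_set(s):
    """Return the set of all subsets of the finite set s, each as a frozenset."""
    items = list(s)
    all_combinations = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in all_combinations}

A = {1, 2, 3}
B = {3, 4, 5}

intersection = A & B   # {x : x in A and x in B}
union = A | B          # {x : x in A or x in B}
P = power_set(A)       # all subsets of A, including the empty set and A itself
```

A set with three elements has 2^3 = 8 subsets, which matches the eight members listed for \wp(A) above.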

Cantor’s set theory soon became a very powerful tool in mathematics. In the early 1900s, the mathematicians Émile Borel, René-Louis Baire, and Henri Lebesgue used Cantor’s set theoretic concepts to develop modern measure theory and function theory (Kanamori 2012). This work clearly demonstrated the great mathematical utility of set theory.

a. Russell’s Paradox

The philosopher and mathematician Bertrand Russell was interested in Cantor’s work and, in particular, Cantor’s proof of the following theorem, which implies that the cardinality of the power set of a set is larger than the cardinality of the set. First, recall that a function g: A \rightarrow B is a surjection (or is onto B) if for all y \in B, there is an x \in A such that g(x)=y.

Cantor’s Theorem. Let A be a set. Then there is no surjection f: A \rightarrow \wp(A). 

Proof. Suppose, for the sake of obtaining a contradiction, that there exists a surjection f: A \rightarrow \wp(A). Observe that, for all z \in A, f(z) \subseteq A. By the Comprehension Principle, let X be the set

X = \{x : x \in A \text{ and } x \notin f(x)\}. 

Clearly, X \subseteq A. Thus, X \in \wp (A). As f is onto \wp(A), there is an a \in A such that f(a) = X. There are two cases to consider: either a \in X or a \notin X. If a \in X, then the definition of X implies that a \notin f(a). Since f(a) = X, we have that a \notin X, which is a contradiction. On the other hand, if a \notin X, then the definition of X implies that a \in f(a). Since f(a) = X, we see that a \in X, a contradiction. Thus, there is no surjection f: A \rightarrow \wp(A). This completes the proof.
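
The proof is fully general, but its key step can be observed on a small finite set. The sketch below assumes A = \{0, 1, 2\} and enumerates all 512 functions from A to \wp(A); for each one it builds the set X from the proof and confirms that X is not a value of the function. The helper names (`power_set`, `diagonal_set`, `all_functions`) are hypothetical and serve only this illustration; an exhaustive check of one small case is of course not a substitute for the proof.

```python
from itertools import chain, combinations, product

def power_set(s):
    """Return a list of all subsets of the finite set s, each as a frozenset."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def diagonal_set(f, A):
    """X = {x in A : x not in f(x)}, as defined in the proof."""
    return frozenset(x for x in A if x not in f[x])

def all_functions(A, B):
    """Yield every function from A to B, represented as a dict."""
    elems = sorted(A)
    for values in product(B, repeat=len(elems)):
        yield dict(zip(elems, values))

A = {0, 1, 2}
subsets = power_set(A)

# Every f: A -> P(A) misses its own diagonal set, so none is a surjection.
every_function_misses_X = all(
    diagonal_set(f, A) not in f.values() for f in all_functions(A, subsets)
)
print(every_function_misses_X)
```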

In 1901, after reading Cantor’s proof of the above theorem, which was published in 1891, Bertrand Russell discovered a devastating contradiction that follows from the Comprehension Principle. This contradiction is known as Russell’s Paradox. Consider the property “x \notin x”, where x represents an arbitrary set. By the Comprehension Principle, we conclude that

A = \{x : x \notin x\} 

is a set. The set A consists of all the sets x that satisfy x \notin x. Clearly, either A \in A or A \notin A. Suppose A \in A. Then, the definition of the set A implies that A must satisfy the property A \notin A, which contradicts our supposition. Suppose A \notin A. Since A satisfies A \notin A, we infer, from the definition of A, that A \in A, which is also a contradiction.
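
The logical core of this argument does not depend on what sets are. For any binary relation, no object can be related to exactly those objects that are not related to themselves. As an illustrative sketch, here is one way to state and check this in Lean 4; the names `no_russell_set` and `mem` are hypothetical, with `mem x r` standing in for x \in r, and `r` plays the role of the set A above.

```lean
-- For any relation `mem` on a type α, there is no r such that
-- for all x: (mem x r) holds exactly when (mem x x) fails.
theorem no_russell_set {α : Type} (mem : α → α → Prop) :
    ¬ ∃ r, ∀ x, mem x r ↔ ¬ mem x x := by
  intro ⟨r, h⟩
  -- Specialize the defining property to x := r.
  have hr : mem r r ↔ ¬ mem r r := h r
  -- First case of the paradox: assuming r ∈ r yields r ∉ r.
  have hn : ¬ mem r r := fun hm => (hr.mp hm) hm
  -- Second case: r ∉ r yields r ∈ r, a contradiction.
  exact hn (hr.mpr hn)
```

The two steps labelled in the comments correspond to the two suppositions, A \in A and A \notin A, considered in the paragraph above.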

There were similar paradoxes discovered by others, including Cantor (Dauben 1979), but Russell’s paradox is the easiest to understand. These paradoxes appeared to threaten Cantor’s fundamental principle that he used to develop set theory. Nevertheless, Cantor did not believe that these paradoxes actually refuted his development of set theory. He knew that the construction of certain collections can lead to a contradiction. Cantor referred to these collections as “inconsistent multiplicities.” Today, such collections are called proper classes, and the paradoxes can be used to prove that they are not sets.

3. The Zermelo-Fraenkel Axioms

Over time, it became clear that, to resolve the paradoxes in Cantor’s set theory, the Comprehension Principle needed to be modified. Thus, the following question needed to be addressed:

How can one correctly construct a set? 

Ernst Zermelo (1871–1953) observed that to eliminate the paradoxes, the Comprehension Principle could be restricted as follows: Given any set A and any property \psi (x), one can form the set \{x \in A : \psi (x)\}; that is, the collection of all elements x \in A that satisfy \psi (x) is a set. Zermelo’s approach differs from Cantor’s method of forming a set. Cantor declared that for every property one can form a set of all the objects that satisfy the property. Zermelo adopted a different approach: To form a set, one must use a property together with a set.

Zermelo also realized that in order to more fully develop Cantor’s set theory, one would need additional methods for forming sets. Moreover, these additional methods would need to avoid the paradoxes. In 1908, Zermelo published an axiomatic system for set theory that, to the best of our knowledge, avoids the difficulties faced by Cantor’s development of set theory. In 1930, after receiving some proposed revisions from Abraham Fraenkel, Zermelo presented his final axiomatization of set theory, now known as the Zermelo-Fraenkel axioms and denoted by ZF. These axioms have become the accepted formulation of Cantor’s ideas about the nature of sets.

a. The Axioms

As noted by Zermelo, to avoid paradoxes, the Comprehension Principle can be replaced with the principle: Given a set A and a property \varphi (x) with a variable x, the collection \{x \in A : \varphi (x)\} is a set. However, this raises a new question: What is a property? The most favored way to address this question is to express the axioms of set theory in the formal language of first-order logic, and then declare that its formulas designate properties. This language involves variables and the logical connectives \wedge (and), \vee (or), \neg (not), \rightarrow (if … then …), and \leftrightarrow (if and only if), together with the quantifier symbols \forall (for all) and \exists (there exists). In addition, this language uses the relation symbols = and \in (as well as \neq and \notin). In this language, the variables and quantifiers range over sets and only sets. A formula constructed in this formal language is referred to as a formula in the language of set theory. Such formulas are used to give meaning to the notion of “property.”

We now illustrate the expressive power of this set theoretic language. The formula \exists x(x \in A) asserts that “the set A is nonempty,” and \forall x(x \notin A) states that “A has no elements.” Moreover, \neg \exists x \forall y(y \in x) states that “it is not the case that there is a set that contains all sets as elements.” In addition, one can translate English statements, which concern sets, into the language of set theory. For example, the English sentence “the set A contains at least two elements” can be translated into the language of set theory by \exists x \exists y((x \in A \wedge y \in A) \wedge x \neq y).
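
To see how such formulas are evaluated, one can restrict the quantifiers to a small finite domain and translate \exists and \forall into Python's `any` and `all`. This is only a toy model: in set theory the quantifiers range over all sets, whereas the `universe` below is a hypothetical five-element stand-in, and the function names are introduced just for this sketch.

```python
def nonempty(A, universe):
    # ∃x (x ∈ A)
    return any(x in A for x in universe)

def has_no_elements(A, universe):
    # ∀x (x ∉ A)
    return all(x not in A for x in universe)

def at_least_two(A, universe):
    # ∃x ∃y ((x ∈ A ∧ y ∈ A) ∧ x ≠ y)
    return any(
        (x in A and y in A) and x != y
        for x in universe
        for y in universe
    )

# Hypothetical finite domain standing in for "all sets".
universe = range(5)

print(nonempty({3}, universe), at_least_two({3}, universe), at_least_two({1, 4}, universe))
```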

There is another quantifier, called the uniqueness quantifier, that is sometimes used. This quantifier is written as \exists ! x \varphi (x) and it means that “there exists a unique x satisfying \varphi (x).” This is in contrast with \exists x \varphi(x), which simply states that “at least one x satisfies \varphi (x).” The uniqueness quantifier is used as a convenience, as the assertion \exists !x \varphi (x) can be expressed in terms of the other quantifiers \exists and \forall; namely, it is equivalent to the formula

\exists x \varphi (x) \wedge \forall x \forall y ((\varphi (x) \wedge \varphi (y)) \rightarrow x=y). 

The above formula is equivalent to \exists!x \varphi (x) because it asserts that “there is an x such that \varphi(x) holds, and any sets x and y that satisfy \varphi (x) and \varphi(y) must be the same set.”
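
The equivalence can be spot-checked over a finite domain. In the sketch below, `exists_unique` counts witnesses directly, while `exists_unique_expanded` evaluates the displayed formula built from \exists and \forall; both names, the six-element `universe`, and the sample properties are assumptions made for illustration. The implication (p \rightarrow q) is encoded as `(not p) or q`.

```python
def exists_unique(phi, universe):
    # ∃!x φ(x): exactly one x in the universe satisfies φ.
    return sum(1 for x in universe if phi(x)) == 1

def exists_unique_expanded(phi, universe):
    # ∃x φ(x) ∧ ∀x ∀y ((φ(x) ∧ φ(y)) → x = y)
    exists = any(phi(x) for x in universe)
    unique = all(
        (not (phi(x) and phi(y))) or x == y
        for x in universe
        for y in universe
    )
    return exists and unique

universe = range(6)
properties = {
    "x == 3": lambda x: x == 3,            # exactly one witness
    "x is even": lambda x: x % 2 == 0,     # several witnesses
    "x > 10": lambda x: x > 10,            # no witness
}

agree = all(
    exists_unique(p, universe) == exists_unique_expanded(p, universe)
    for p in properties.values()
)
print(agree)
```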

The Zermelo-Fraenkel axioms are listed below. Each axiom is first stated in English and then written in logical form. After each logical form, there is a discussion of the axiom and some of its consequences. When reading these axioms, keep in mind that, in Zermelo-Fraenkel set theory, everything is a set, including the elements of a set. Also, the notation \vartheta (x, \ldots, z) means that x, \ldots, z are free variables in the formula \vartheta and that \vartheta is allowed to contain parameters (free variables other than x, \ldots, z) that represent arbitrary sets.

a)

Extensionality Axiom. Two sets are equal if and only if they have the same elements. 

\forall A \forall B ( A = B \leftrightarrow \forall x ( x \in A \leftrightarrow x \in B)). 

The extensionality axiom is essentially a “definition” that states that two sets are equal if and only if they have exactly the same elements.

b)

Empty Set Axiom. There is a set with no elements. 

\exists A \forall x ( x \notin A). 

The empty set axiom states that there is a set which has no elements. Since the extensionality axiom implies that this set is unique, we let \varnothing denote the empty set.

c)

Subset Axiom. Let \varphi(x) be a formula. For every set A, there is a set S that consists of all the elements x \in A such that \varphi(x) holds. 

\forall A \exists S \forall x ( x \in S \leftrightarrow ( x \in A \wedge \varphi (x))). 

(The variable S is assumed not to appear in the formula \varphi (x).) The subset axiom, also known as the axiom of separation, asserts that any definable sub-collection of a set is itself a set, that is, for any formula \varphi(x) and any set A, the collection \{x \in A : \varphi(x)\} is a set. Clearly, the subset axiom is a limited form of the Comprehension Principle. Yet, it does not lead to the contradictions that result from the Comprehension Principle. The subset axiom is, in fact, an axiom schema since it yields infinitely many axioms, one for each formula \varphi.

d)

Pairing Axiom. For every u and v, there is a set that consists of just u and v. 

\forall u \forall v \exists P \forall x ( x \in P \leftrightarrow ( x =u \vee x = v)). 

The pairing axiom states that, for any two sets u and v, the set \{u, v\} exists. Thus, by the extensionality axiom, the set \{u, u\} = \{u\} exists.

e)

Union Axiom. For every set F, there exists a set U that consists of all the elements that belong to at least one set in F. 

\forall F \exists U \forall x ( x \in U \leftrightarrow \exists C (C \in F \wedge x \in C)). 

The union axiom states that, for any set F, there is a set U whose elements are precisely those elements that belong to an element of F, that is, x \in U if and only if x \in A for some A \in F. The extensionality axiom implies that the set U is unique; it is often denoted by \bigcup F. For example, consider the set \{A,B\}. Then

\bigcup \{A,B\} = \{x : x \text{ belongs to a member of } \{A,B\}\} = \{x : x \in A \text{ or } x \in B\} = A \cup B. 

For another example, let F = \{ \{a,b,c\},\{e,f\},\{e,c,d\} \}. Then \bigcup F = \{a,b,c,d,e,f\}.
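
The second example can be reproduced with a short Python sketch. The function name `big_union` is a hypothetical stand-in for \bigcup, and the members of F are stored as `frozenset` objects (here built from strings of single-letter names) so that they can be elements of another set.

```python
def big_union(F):
    """Return the set of all x that belong to at least one member of F."""
    return frozenset(x for C in F for x in C)

# F = {{a,b,c}, {e,f}, {e,c,d}}, with letters as stand-ins for the elements.
F = {frozenset("abc"), frozenset("ef"), frozenset("ecd")}
U = big_union(F)

print(sorted(U))
```

Applied to a two-element family \{A, B\}, the same function returns the ordinary union A \cup B.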

f)

Power Set Axiom. For every set A, there exists a set P that consists of all the sets that are subsets of A. 

\forall A \exists P \forall x ( x \in P \leftrightarrow \forall y( y \in x \rightarrow y \in A)). 

The power set axiom states that, for any set A, there is a set, which we denote by \wp(A), such that for any set B, B \in \wp(A) if and only if B \subseteq A.

g)

Infinity Axiom. There is a set I that contains the empty set as an element and whenever x \in I, then x \cup \{x\} \in I. 

\exists I ( \varnothing \in I \wedge \forall x (x \in I \rightarrow x \cup \{ x \} \in I)). 

The infinity axiom ensures the existence of at least one infinite set. For any set x, the successor of x is defined to be the set x^{+} = x \cup \{x\}. Thus, the axiom of infinity asserts that there is a set I such that \varnothing \in I and if x \in I, then x^{+} \in I. Note that \varnothing^{+} = \{\varnothing\}, and that \{\varnothing\}^{+} = \{\varnothing,\{\varnothing\}\}. It follows that the set I contains each of the sets

\varnothing; \{\varnothing\}; \{\varnothing, \{\varnothing \}\}; \{\varnothing, \{\varnothing\}, \{\varnothing, \{\varnothing \}\}\}; \ldots. 

One can show that any two of the sets in the above list (separated by a semi-colon) are distinct. Hence, the set I contains an infinite number of elements; in other words, I is an infinite set. So, the infinity axiom simply states that infinite sets exist and are legitimate mathematical objects. The infinity axiom is a key tool that is used to develop the set of natural numbers \mathbb{N} and to prove that \mathbb{N} is well-ordered, that is, every nonempty set of natural numbers has a least element.
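
The successor operation can be tried out on hereditarily finite sets using Python's immutable `frozenset` type. This is a sketch under that finite restriction; the names `successor`, `empty`, and `stages` are introduced here for illustration. Starting from the empty set, it applies x^{+} = x \cup \{x\} four times.

```python
def successor(x):
    """Return x+ = x ∪ {x} for a hereditarily finite set x."""
    return x | frozenset({x})

empty = frozenset()
stages = [empty]
for _ in range(4):
    stages.append(successor(stages[-1]))

# stages[n] has exactly n elements, so the five sets are pairwise distinct.
print([len(s) for s in stages])
```

Each set in the list is an element of the next one, and the n-th set has exactly n elements, which is why these sets can later serve as the natural numbers.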

h)

Replacement Axiom. Let \psi (x, y) be a formula. For every set A, if for each x \in A there is a unique y such that \psi (x, y), then there is a set S that consists of all of the elements y such that \psi (x, y) for some x \in A. (Below, \exists! is the uniqueness quantifier.) 

\forall A (\forall x ( x \in A \rightarrow \exists ! y \psi (x,y)) \rightarrow \exists S \forall y( y \in S \leftrightarrow \exists x (x \in A \wedge \psi(x, y)))). 

(The variable S is assumed not to appear in the formula \psi (x, y).) The replacement axiom states that for every set A, if for each x \in A there is a unique y such that \psi(x,y), then the collection \{y : \exists x (x \in A \wedge \psi(x,y))\} is a set; that is, the “functional image of a set is a set.” The replacement axiom is a special form of Cantor’s Comprehension Principle that plays a critical role in modern set theory. However, the replacement axiom does not lead to the contradictions that follow from the Comprehension Principle. Like the subset axiom, the replacement axiom is an axiom schema. Accordingly, there are infinitely many Zermelo-Fraenkel axioms.
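
For finite sets, the idea that a functional image of a set is a set corresponds to a simple set comprehension. In the sketch below, the Python function `F` plays the role of the functional formula \psi(x, y), with y = F(x); the name `replacement` and the sample functions are assumptions made for illustration.

```python
def replacement(A, F):
    """Return {F(x) : x in A}, the image of the finite set A under F."""
    return {F(x) for x in A}

A = {0, 1, 2, 3}

S = replacement(A, lambda x: x * x)   # each x is replaced by its square
T = replacement(A, lambda x: x % 2)   # distinct x may be sent to the same y

print(sorted(S), sorted(T))
```

The second call shows that the resulting set can have fewer elements than A, since the axiom requires a unique y for each x but does not require different x to yield different y.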

i)

Regularity Axiom. Each nonempty set A contains an element that is disjoint from A. 

\forall A ( A \neq \varnothing \rightarrow \exists x ( x \in A \wedge \neg \exists y ( y \in x \wedge y \in A))). 

The regularity axiom, also known as the axiom of foundation, states that, for any nonempty set A, there is a set x \in A such that A \cap x = \varnothing. The regularity axiom rules out the possibility of a set belonging to itself. In standard mathematics, there are no sets that are members of themselves. For example, the set of natural numbers is not a natural number. The regularity axiom eliminates collections that are not relevant for standard mathematics. The regularity and pairing axioms imply that if a \in b, then b \notin a. To see this, suppose that a \in b. By the pairing axiom, \{a,b\} is a set, so regularity yields an element of \{a,b\} that is disjoint from \{a,b\}. That element cannot be b, since a \in b \cap \{a,b\}. Hence a \cap \{a,b\} = \varnothing, and so b \notin a.
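
The argument about \{a, b\} can be traced on a concrete pair of hereditarily finite sets. The sketch below uses Python `frozenset` objects built bottom-up from the empty set; the names `has_disjoint_element`, `one`, `two`, and `pair` are illustrative assumptions, and the example only exhibits the axiom in one instance rather than proving anything.

```python
def has_disjoint_element(A):
    """Regularity for a single set A: some x in A satisfies A ∩ x = ∅."""
    return any(A.isdisjoint(x) for x in A)

empty = frozenset()
one = frozenset({empty})          # {∅}
two = frozenset({empty, one})     # {∅, {∅}}

a, b = one, two                   # here a ∈ b
pair = frozenset({a, b})          # the set {a, b} given by the pairing axiom

# b meets {a, b} (they share a), so the disjoint element must be a.
print(pair.isdisjoint(b), pair.isdisjoint(a), b in a)
```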

The Zermelo-Fraenkel axioms are now the most widely accepted answer to the question: How can one correctly construct a set? Of course, these axioms are more restrictive than Cantor’s Comprehension Principle; however, no one, in over 100 years, has been able to derive a contradiction from these axioms. Moreover, all of the classic results (excluding the paradoxes) that were derived using Cantor’s naïve set theory can be derived from the Zermelo-Fraenkel axioms.

It is a remarkable fact that essentially all mathematical objects can be defined as sets within Zermelo-Fraenkel set theory. For example, functions, relations, the natural numbers, and the real numbers can be defined within Zermelo-Fraenkel set theory. Hence, effectively all theorems of mathematics can be considered as statements about sets and proven from the Zermelo-Fraenkel axioms.

b. Classes

The argument used in Russell’s Paradox can be applied to prove, in ZF, that there is no set that contains all sets (as elements). As every set is equal to itself, the collection \{x : x = x\} contains every set, but this collection is not a set. Thus, given a formula \varphi(x), one cannot necessarily conclude that the collection \{x : \varphi(x)\} is a set. However, in set theory, it is convenient to be able to discuss such collections. They cannot be called sets. Instead, a collection of the form \{x : \varphi(x)\} is called a class. The collection \{x : x = x\} is a class that is not a set; for this reason, it is called a proper class.

When can one prove that a class is a set? Let us say that a class \{x : \varphi(x)\} is bounded if and only if there is a set A such that for all x, if \varphi(x), then x \in A. Using the subset axiom, one can prove that a bounded class is a set. It follows that the class \{x : x = x\} is not bounded.
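The step from a bounded class to a set, and the Russell-style argument that no set contains every set, can be imitated on finite sets. This is a minimal sketch under stated assumptions: sets are modelled as nested Python `frozenset` values, and the names `separate` and `russell` are hypothetical helpers, not notation from the text.

```python
def separate(A, phi):
    """Subset axiom on a finite set A: the set {x in A : phi(x)}."""
    return frozenset(x for x in A if phi(x))

def russell(A):
    """The set {x in A : x not in x}; it can never be an element of A."""
    return separate(A, lambda x: x not in x)

empty = frozenset()
one = frozenset({empty})
A = frozenset({empty, one, frozenset({one})})

R = russell(A)
# A frozenset never contains itself, so here R is all of A, and R is not in A.
print(R == A, R in A)
```

If R were an element of A, then R \in R would hold if and only if R \notin R; so every set A misses at least one set, namely its own Russell subset.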

In the Zermelo-Fraenkel axioms, there is no explicit mention of classes. However, there are alternative axiomatizations of set theory that extend ZF by including classes as objects in the language, that is, these axiom systems give classes a formal state of existence. The most common such axiomatic treatment of classes is denoted by NBG (von Neumann–Bernays–Gödel). The NBG system uses a formal language that has two different types of variables: capital letters denote classes and lowercase letters denote sets. In addition, classes can contain only sets as elements. So, a class that is not a set cannot belong to a class. Thus, a class X is a set if and only if \exists Y (X \in Y). In the NBG system, sets satisfy all of the ZF axioms, and the intersection of a class with a set is a set, that is, X \cap y is a set. The NBG system also has the class comprehension axiom:

\exists X \forall y (y \in X \leftrightarrow \varphi (y)) 

where the formula \varphi(y) can contain set parameters and/or class parameters (with other restrictions). Thus, the class comprehension axiom asserts that \{x : \varphi(x)\} is a class.

The NBG system is a conservative extension of ZF; that is, a sentence with only lowercase (set) variables is provable in NBG if and only if it is provable in ZF. The Zermelo-Fraenkel system has a clear advantage over NBG, namely, the simplicity of working with only one type of object (sets) rather than two types of objects (sets and classes). The Zermelo-Fraenkel axiomatic system is the standard system of axioms for modern set theory.

4. Cantor’s Well-Ordering Principle

As proposed by Cantor, two sets A and B have the same cardinality if and only if there is a bijection f: A \rightarrow B. When A is a finite set, there is a unique natural number, denoted by |A|, that identifies the number of elements in A. In this case, we say that |A| is the cardinality of A. For example, if A = \{3,5,7,2\}, then |A| = 4. Clearly, the cardinality of a finite set identifies the number of elements that are in the set. Moreover, if A and B are both finite sets, then one can prove that

|A| = |B| if and only if there exists a bijection f: A \rightarrow B.

(\Delta) 
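For small finite sets, statement (\Delta) can be checked by brute force. The sketch below enumerates every function from A to B and keeps the bijections; the function name `bijections` and the sample sets are illustrative assumptions.

```python
from itertools import product

def bijections(A, B):
    """Yield every bijection from A to B as a dict (finite sets only)."""
    A, B = list(A), list(B)
    for values in product(B, repeat=len(A)):   # every function A -> B
        injective = len(set(values)) == len(A)
        surjective = set(values) == set(B)
        if injective and surjective:
            yield dict(zip(A, values))

A = {3, 5, 7, 2}
B = {"w", "x", "y", "z"}
C = {0, 1, 2}

same_size = list(bijections(A, B))       # |A| = |B| = 4
different_size = list(bijections(A, C))  # |A| = 4 but |C| = 3
print(len(same_size), len(different_size))
```

The search finds bijections exactly when the two sets have the same number of elements, as (\Delta) asserts.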

With this understanding, Cantor asked the following question:

Are there values that can represent the size of infinite sets and satisfy (\Delta)?

In other words, given two infinite sets A and B, can one assign values |A| and |B| such that

|A| = |B| if and only if there exists a bijection f: A \rightarrow B? 

Cantor answered this question, in the affirmative, by developing the transfinite ordinal numbers, which are “infinite numbers” in the sense that they are larger than all of the natural numbers, and are well-ordered just like the natural numbers. Cantor believed that each infinite set can be assigned a specific ordinal number and that this ordinal number would measure the size of the set. Cantor realized that, in order to successfully apply his theory of ordinal numbers, he needed an additional principle. In 1883, he proposed the following principle.

Well-Ordering Principle: It is always possible to bring any well-defined set into the form of a well-ordered set. 

A relation \leq on a set X is a well-ordering of X if and only if it is a total ordering in which every non-empty subset of X has a least element, where it is assumed that the relation \leq does not apply to any elements that are not in X. If a set can be well-ordered, then one can generalize the concepts of induction and recursion, similar to mathematical induction, on the elements of the set. Given any infinite set, Cantor used the well-ordering principle to identify an ordinal number that measures the size of the set. Such an ordinal is called a cardinal number.
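The definition of a well-ordering can be tested mechanically on a finite set. The following Python sketch is an illustration only; the function name `is_well_ordering` and the sample relations are assumptions introduced here.

```python
from itertools import combinations

def is_well_ordering(X, leq):
    """Check that leq is a total ordering of the finite set X in which
    every nonempty subset has a least element."""
    X = list(X)
    reflexive = all(leq(x, x) for x in X)
    antisymmetric = all(x == y or not (leq(x, y) and leq(y, x))
                        for x in X for y in X)
    transitive = all(leq(x, z) or not (leq(x, y) and leq(y, z))
                     for x in X for y in X for z in X)
    total = all(leq(x, y) or leq(y, x) for x in X for y in X)
    if not (reflexive and antisymmetric and transitive and total):
        return False
    for size in range(1, len(X) + 1):
        for S in combinations(X, size):
            # a least element m of S satisfies m <= s for every s in S
            if not any(all(leq(m, s) for s in S) for m in S):
                return False
    return True

usual = is_well_ordering({0, 1, 2, 3}, lambda x, y: x <= y)
divides = is_well_ordering({1, 2, 3, 6}, lambda x, y: y % x == 0)
print(usual, divides)
```

Divisibility fails because 2 and 3 are incomparable. On a finite set every total ordering passes the least-element test; the condition only becomes restrictive for infinite sets, such as the integers under their usual ordering, which have no least element.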

a. Ordinal Numbers

The natural numbers are often used for two purposes: to indicate the position of an element in a sequence and to identify the size of a finite set. In other words, a natural number can be used to identify a position (first, second, third, …) and it can be used to identify a size (one, two, three, …). Cantor extended the natural numbers by introducing the concepts of transfinite position and transfinite size. Suppose that we want to count the number of real numbers. As noted in Section 1, Cantor proved that the set of real numbers is uncountable. Thus, if we attempted to assign each real number to exactly one of the natural numbers 0, 1, 2, 3, \ldots, then we would not have enough natural numbers to complete this task. However, suppose that we add some new numbers, called transfinite ordinals, to our stock of numbers. Clearly, we need an ordinal that will identify the first position that occurs after all of the natural numbers. Cantor denoted this ordinal by the Greek letter \omega. That is, Cantor proposed the following “position” sequence

0, 1, 2, 3, 4, \ldots, \omega.

(1) 

Observe the following:

  • By starting with 0 and repeatedly adding 1, we obtain all of the natural numbers.
  • Every natural number greater than 0 has an immediate predecessor; for example, 5 has 4 as its immediate predecessor.

By contrast, the ordinal number \omega cannot be obtained by repeatedly adding 1 to 0 and it does not have an immediate predecessor. For these reasons, we say that \omega is a limit ordinal.

We can continue the sequence (1) by repeatedly adding 1 to \omega. By doing so, we obtain the following position sequence:

0, 1, 2, 3, 4, \ldots, \omega, \omega+1, \omega+2, \omega+3, \ldots

(2) 

The process for constructing (1) and (2) can be repeated endlessly. In this way, we obtain the ordered sequence of all of the ordinals:

0, 1, 2, 3, 4, \ldots, \omega, \omega+1, \omega+2, \ldots ,\omega+\omega,(\omega+\omega)+1,(\omega+\omega)+2, \ldots

(3) 

where \omega+\omega is a limit ordinal which is usually represented by 2 \cdot \omega. An ordinal of the form \alpha+1 is called a successor ordinal. An ordinal \delta > 0 that is not a successor ordinal is called a limit ordinal. Cantor used the ordinals to measure the “length” of a well-ordered set.

The natural numbers 0, 1, 2, 3, 4, \ldots are sometimes called finite ordinals. Every nonempty subset of the natural numbers has a least element. Similarly, every nonempty set of ordinals has a least element with respect to the ordering in (3). The ordinal numbers are a generalized extension of the natural numbers. One can define the operations of addition, multiplication, and exponentiation on the ordinal numbers. These operations satisfy some (but not all) of the arithmetic properties that hold on the natural numbers, for example, addition is associative (Cunningham 2016).
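To see one arithmetic property that fails, consider the ordinals that appear in the list (3) before \omega \cdot \omega. Each of them consists of a copies of \omega followed by a finite number b. The sketch below encodes such an ordinal as the pair `(a, b)`; this encoding and the names `add`, `ONE`, and `OMEGA` are assumptions made for illustration.

```python
def add(x, y):
    """Ordinal sum x + y, where (a, b) stands for a copies of ω followed by b."""
    (a, b), (c, d) = x, y
    if c > 0:
        # the finite tail b is absorbed by the copy of ω that follows it
        return (a + c, d)
    return (a, b + d)

ONE = (0, 1)
OMEGA = (1, 0)

left = add(ONE, OMEGA)    # 1 + ω
right = add(OMEGA, ONE)   # ω + 1
print(left, right)
```

Here 1 + \omega equals \omega, while \omega + 1 is strictly larger, so ordinal addition is not commutative even though it is associative. Python compares tuples lexicographically, which agrees with the ordering of these ordinals in the list (3).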

The set of predecessors of an ordinal is the set of all of the ordinals that come before it in the list (3); for example, the sets of predecessors of \omega and \omega+1 are the respective sets

\mathbb{N} = \{0, 1, 2, 3, 4, \ldots\}, N' = \{0, 1, 2, 3, 4, \ldots , \omega \}.
(4)

The ordinals \omega and \omega+1 represent different positions in the list (3); but, the sets \mathbb{N} and N' in (4) have the same cardinality. Note that the cardinality of \mathbb{N} is larger than that of any finite set, that is, for any natural number n, the set \mathbb{N} has cardinality larger than the set \{0, 1, 2, \ldots, n\}. For this reason, we say that \omega is a cardinal number.

For any two ordinals \alpha and \beta, we say that \alpha < \beta if and only if \alpha appears before \beta in the list (3). For each ordinal \gamma, let Pred(\gamma) = \{\alpha : \alpha < \gamma\} be the set of predecessors of \gamma. One can prove, in ZF, that Pred(\gamma) is a set. In contemporary set theory one usually defines the ordinals so that, for each ordinal \gamma, \gamma = Pred(\gamma); that is, each ordinal is defined to be the set of its predecessors. Specifically, a set \gamma is said to be an ordinal if and only if \gamma is well-ordered by the membership relation and is transitive, that is, every element in \gamma is a subset of \gamma. Thus, if \alpha < \beta, then \alpha \in \beta and \alpha \subseteq \beta. For example, \omega = \{0, 1, 2, 3, 4, \ldots\} is an ordinal if the natural numbers (the finite ordinals) are defined as follows:

  • 0 = \varnothing,
  • 1 = \{0\},
  • 2 = \{0,1\},
  • 3 = \{0,1,2\},
  • 4 = \{0,1,2,3\}.

This approach is due to von Neumann (Kunen 2009), and such ordinals are called von Neumann ordinals. The collection of all ordinals is a proper class (see Cunningham 2016).
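The finite ordinals listed above can be built and inspected directly. In this minimal sketch, sets are modelled as nested Python `frozenset` values, each ordinal is obtained from the previous one by the successor operation n \cup \{n\}, and the helper names are hypothetical.

```python
def successor(n):
    """The next ordinal after n, namely n ∪ {n}."""
    return n | frozenset({n})

def finite_ordinal(k):
    """The finite ordinal k, built from ∅ by k successor steps."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n

def is_transitive(s):
    """True when every element of s is also a subset of s."""
    return all(x <= s for x in s)

two, three, four = finite_ordinal(2), finite_ordinal(3), finite_ordinal(4)

# 4 = {0, 1, 2, 3}: each ordinal is the set of its predecessors.
print(four == frozenset(finite_ordinal(i) for i in range(4)))
# 2 < 3 shows up both as membership and as proper inclusion.
print(two in three, two < three, is_transitive(four))
```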

b. Cardinal Numbers

An ordinal number \kappa is said to be a cardinal if and only if, for all \alpha < \kappa, the set Pred(\alpha) has smaller cardinality than Pred(\kappa). It follows that the natural numbers are all cardinals. As noted above, \omega is the first transfinite cardinal, which is often denoted by \aleph_{0}. The next transfinite cardinal, after \aleph_{0}, is designated by \aleph_{1}. This process can be continued to produce the following sequence of finite and transfinite cardinals:

0, 1, 2, 3, 4, \ldots, \aleph_{0}, \aleph_{1}, \ldots, \aleph_{\omega}, \aleph_{\omega+1}, \ldots, \aleph_{2 \cdot \omega}, \ldots, \aleph_{\omega \cdot \omega}, \ldots

(5) 

where the transfinite cardinal numbers in (5) are indexed by the ordinal numbers. Thus, the collection of all the cardinal numbers is a proper class. A cardinal \aleph_{\beta} is called a successor cardinal if and only if \beta is a successor ordinal; otherwise, it is called a limit cardinal. One can prove, in ZF, that, for every cardinal \kappa, there is an ordinal \alpha such that \kappa = \aleph_{\alpha} (Cunningham 2016). Thus, every cardinal appears on the list (5). One can define the operations of addition, multiplication, and exponentiation on the cardinals (exponentiation requires the well-ordering principle). These particular operations are not the same as the corresponding operations on the ordinal numbers (Cunningham 2016).

Cantor used the cardinal numbers to measure the “size” of sets. The well-ordering principle implies that every set A can be assigned a (unique) cardinal number that measures its size. This cardinal number is usually denoted by |A|, and is called the cardinality of A. Cantor’s Theorem implies that, for any set A, |A| < |\wp(A)|. The operation of cardinal exponentiation allowed Cantor to prove that the cardinality of \mathbb{R}, the set of real numbers, is equal to 2^{\aleph_{0} }, that is, |\mathbb{R}| = 2^{\aleph_{0}}. Since \aleph_{1} is the first cardinal greater than \aleph_{0}, Cantor was able to express the Continuum Hypothesis in terms of the equation 2^{\aleph_{0}} = \aleph_{1}. Moreover, assuming the well-ordering principle, one can conclude that a set A is countable if and only if |A| \leq \aleph_{0} and that a set B is uncountable if and only if \aleph_{1} \leq |B|.

Infinite cardinals come in two distinct forms: regular or singular. An infinite cardinal \kappa is said to be a regular cardinal if and only if \kappa is not the union of a set consisting of fewer than \kappa many smaller cardinals. Thus, if \kappa is a regular cardinal, S is a set of cardinals smaller than \kappa, and |S| < \kappa, then \kappa \neq \bigcup S. Assuming the well-ordering principle, it follows that each successor cardinal is a regular cardinal. When a cardinal is not regular, it is called a singular cardinal. One can show that an infinite cardinal \kappa is singular if and only if there exists an ordinal \beta < \kappa and a function f: Pred(\beta) \rightarrow Pred(\kappa) such that for all \gamma < \kappa there is an ordinal \alpha < \beta such that \gamma < f(\alpha). It follows that \aleph_{\omega} is a singular cardinal.

5. The Axiom of Choice

At the third International Congress of Mathematicians at Heidelberg in 1904, Julius König submitted a proof that the well-ordering principle is false; in particular, he presented an argument showing the set of real numbers cannot be well-ordered. On the next day, Ernst Zermelo identified an error in König’s purported proof. Shortly after the Heidelberg congress, Zermelo (Moore 2012) discovered a proof of the following theorem, which implies that the error found in König’s proof cannot be removed.

Well-Ordering Theorem: Every set can be well-ordered.

In his clever proof of the well-ordering theorem, Zermelo formulated and applied the following principle, which he was the first to identify.

Axiom of Choice (AC). Let T be a set of nonempty sets. Then there is a function F such that, for each set A in T, F(A) \in A. 

The function F mentioned in AC is called a choice function for the set T. Informally, the axiom of choice asserts that, for any collection of nonempty sets, it is possible to uniformly choose exactly one element from each set in the collection. When T is a finite set, one can prove, in ZF, that there exists a choice function. Today, mathematicians use the axiom of choice when the set T is infinite and it is not clear how to define or construct a desired choice function.
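When T is finite and an explicit selection rule is available, a choice function can simply be written down. The sketch below uses the rule "take the least element", which is an assumption that only works because the sample elements are comparable numbers; the name `choice_function` is hypothetical.

```python
def choice_function(T):
    """Return a dict F with F[A] in A for every set A in the finite family T."""
    F = {}
    for A in T:
        if not A:
            raise ValueError("every member of T must be nonempty")
        F[A] = min(A)  # an explicit, definable rule: pick the least element
    return F

T = [frozenset({3, 1, 2}), frozenset({5}), frozenset({8, 4})]
F = choice_function(T)
print(all(F[A] in A for A in T))
```

No appeal to the axiom of choice is needed here, because the rule itself defines the function. The axiom is invoked precisely when no such rule is known.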

Zermelo applied the axiom of choice to establish the well-ordering theorem. The well-ordering theorem validates both Cantor’s well-ordering principle and that every set can be assigned a cardinal number that measures its size.

a. On Zermelo’s Proof of the Well-Ordering Principle

Zermelo’s proof of the well-ordering theorem is the first mathematical argument that explicitly invokes the axiom of choice. As a result, the proof can be viewed as an important moment in the development of modern set theory. For this reason, we now present a summary of this proof. Let A be a nonempty set and let T be the set of all nonempty subsets of A; that is, let

T = \{ X \in \wp (A) : X \neq \varnothing \}. 

Let \gamma be a choice function for T. Call a set X \in T a \gamma-set if and only if there is a well-ordering \leq of X such that, for each a \in X,

\gamma(\{z \in A : z \not< a\}) = a.

Thus, each element a \in X is the element that the choice function \gamma selects from the set of all elements in A that do not (strictly) precede a in the ordering \leq. For example, if w = \gamma(A), then one can show that \{w\} is a \gamma-set. Thus, \gamma-sets exist. Let X be a \gamma-set with well-ordering \leq and let Y be a \gamma-set with well-ordering \leq'. In his proof, Zermelo showed that either X \subseteq Y and \leq' continues \leq or Y \subseteq X and \leq continues \leq', where we say that \leq' continues \leq when the order \leq' only adds new elements that are greater than all of the elements ordered by \leq. Zermelo also showed that the union of all of the \gamma-sets is a \gamma-set and that this union equals A. Therefore, A can be well-ordered.
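On a finite set, the idea behind the proof can be carried out step by step: the choice function repeatedly picks an element from what has not yet been placed. This Python sketch is a finite analogue only, not Zermelo's argument itself; the names `well_order` and `is_gamma_set` and the sample choice function are assumptions introduced here.

```python
def well_order(A, gamma):
    """List the elements of the finite set A in the order dictated by gamma."""
    remaining = set(A)
    order = []
    while remaining:
        a = gamma(frozenset(remaining))  # choose among the elements not yet placed
        order.append(a)
        remaining.remove(a)
    return order

def is_gamma_set(order, A, gamma):
    """Check that each a in the list is what gamma selects from the
    elements of A that do not strictly precede a."""
    A = set(A)
    return all(gamma(frozenset(A - set(order[:i]))) == a
               for i, a in enumerate(order))

A = {"pear", "fig", "apple", "kiwi"}
# Sample choice function: shortest word first, ties broken alphabetically.
gamma = lambda S: min(S, key=lambda s: (len(s), s))

order = well_order(A, gamma)
print(order, is_gamma_set(order, A, gamma))
```

Every initial segment of the resulting list is again a \gamma-set, and the one-element list containing \gamma(A) is the smallest nonempty example.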

Essentially, the axiom of choice states that one can make infinitely many arbitrary choices. As noted above, Cantor’s acceptance of infinite sets led to a dispute among some of Cantor’s contemporaries. Similarly, Zermelo’s axiom of choice incited further controversy concerning the infinite. The main objection to the axiom of choice was the obvious one: How can the existence of a choice function be justified when such a function cannot be defined or explicitly constructed? Surprisingly, many of the axiom’s severest critics had unwittingly applied the axiom in their own work. In the decades following its introduction, the axiom of choice gained acceptance among most mathematicians; in part, this was because the axiom of choice is a very useful principle whose deductive strength is required to prove many important mathematical theorems (Moore 2012). Moreover, the axiom of choice is equivalent to a number of seemingly unrelated principles in mathematics. For example, in ZF, the axiom of choice is equivalent to Zorn’s lemma, the well-ordering theorem, and the comparability theorem (see Cunningham 2016).

The Zermelo-Fraenkel system of axioms is denoted by ZF and the axiom of choice is abbreviated by AC. The axiom of choice is not one of the axioms in ZF. The result of adding the axiom of choice to the system ZF is denoted by ZFC.

There were many unsuccessful attempts to prove the axiom of choice assuming only the axioms in ZF. As a result, mathematicians began to doubt the possibility of proving the axiom of choice from the axioms in ZF and, eventually, it was shown that such a proof does not exist. The combined work of Kurt Gödel, in 1940, and Paul Cohen, in 1963, confirmed that the axiom of choice is independent of the Zermelo-Fraenkel axioms, that is, AC cannot be proven or refuted using just the axioms in ZF. Nevertheless, the axiom of choice is a powerful tool in mathematics and there are many significant theorems that cannot be established without it. Consequently, mathematicians typically assume the axiom of choice and often cite it when they use it in a proof.

b. Banach-Tarski Paradox

Set theory frequently deals with infinite sets. Moreover, as we have seen, there are times when infinite sets have properties that are unlike those of finite sets. Such properties of infinite sets can appear to be counter-intuitive or paradoxical, because they conflict with the behavior of finite sets or with our limited intuition. Cantor proved a theorem that illustrates this fact. Let I denote the unit interval \lbrack 0,1 \rbrack, that is, the set of all real numbers x such that 0 \leq x \leq 1. Let S denote the unit square in the plane, that is, the set of all ordered pairs (x,y) such that 0 \leq x \leq 1 and 0 \leq y \leq 1. The sets I and S appear in the following figure:


Cantor initially believed that the set of points in the two-dimensional square S must have cardinality much larger than the set of points in the one-dimensional interval I. Then he discovered a proof showing that his initial intuition was wrong. Cantor’s theorem below, which can be proven without the axiom of choice, shows the sets I and S have the same cardinality.

Theorem (Cantor). There exists a bijection f: I \rightarrow S. 

One can use the bijection f: I \rightarrow S to proclaim that one can, theoretically, disassemble all of the points in the interval I and then reassemble these points to obtain the unit square S. This, of course, is counter-intuitive, as we know that one cannot cut-up a 1-foot piece of thread and then put the pieces together to obtain a square-foot piece of fabric. Thus, there are infinite abstract objects that do not behave in the same way as finite concrete objects.
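One standard route to such a result, not necessarily the argument Cantor gave, interleaves decimal digits: the point (0.x_{1}x_{2}\ldots, 0.y_{1}y_{2}\ldots) of S is sent to the number 0.x_{1}y_{1}x_{2}y_{2}\ldots in I. With a fixed choice of decimal expansion for each number, this map is one-to-one from S into I; since I also maps one-to-one into S, the Cantor–Schröder–Bernstein theorem then yields a bijection. The sketch below works with finite digit strings only, so it illustrates the idea rather than the theorem; the function names are hypothetical.

```python
def interleave(x_digits, y_digits):
    """Merge two equal-length digit strings as x1 y1 x2 y2 ..."""
    if len(x_digits) != len(y_digits):
        raise ValueError("digit strings must have the same length")
    return "".join(a + b for a, b in zip(x_digits, y_digits))

def split(z_digits):
    """Recover the two digit strings from an interleaved string."""
    return z_digits[0::2], z_digits[1::2]

z = interleave("250", "731")   # the point (0.250, 0.731)
print(z, split(z))
```

Because `split` undoes `interleave`, two different points can never be sent to the same digit string.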

We now present a theorem due to Stefan Banach and Alfred Tarski (1924). The proof of this theorem uses the axiom of choice, in an essential manner, to prove another counter-intuitive result. Some have claimed that this theorem thus refutes the axiom of choice. First, we identify some terminology. In three-dimensional space, a unit ball is a set of points of distance less than or equal to 1 from a fixed central point.

Theorem (Banach, Tarski). A unit ball in three-dimensional space can be split into five pieces that can be rigidly moved, rotated, and put back together to form two unit balls. 

The Banach–Tarski Theorem is often referred to as a paradox because it is counter-intuitive; for example, the theorem implies that, theoretically, one can split a solid glass ball into five pieces and then use the pieces to create two new glass balls of the same size as the original. However, in the proof of the theorem, the five pieces that are formed are not solids that have a measurable volume; they are five complex infinite sets of points. We repeat: there are infinite abstract objects that do not behave in the same way as finite concrete objects.

The conclusion of the Banach–Tarski Theorem does not refute the axiom of choice, and Cantor’s above theorem does not render the axioms of set theory false. Ever since the ancient Greeks, there have been results in mathematics that were once viewed as being counter-intuitive. Such results eventually become better understood and, as a result, become more intuitive themselves.

6. The Cumulative Hierarchy

Zermelo’s 1904 proof of the well-ordering theorem resembles von Neumann’s 1923 proof of the transfinite recursion theorem, a powerful tool in set theory. A formula \varphi(g,u) is said to be functional if and only if \forall g \exists ! u \varphi (g,u); that is, for all g, there is a unique u such that \varphi(g,u). Given a functional formula, \varphi(g,u), consider the class of ordered pairs

F = \{(g,u) : \varphi(g,u)\}.

Since \varphi(g,u) is functional, one can view F as a class function (that is, a functional class), and thus, F(x) is a set whenever x is a set. Let F|A denote the function obtained by restricting the domain of F to the set A. The replacement axiom implies that F|A is a set whenever A is a set.

Transfinite Recursion Theorem: Let \varphi(g,u) be a functional formula. Then there is a class function H such that, for all ordinals \beta, \varphi(H|\beta,H(\beta)). 

The transfinite recursion theorem is used to define what is commonly known as the cumulative hierarchy of sets and usually denoted by \{V_{\beta} : \beta \text{ is an ordinal}\}, which satisfies (see figure below)

  • V_{0} = \varnothing,
  • V_{\gamma + 1} = \wp (V_{\gamma}), for any ordinal \gamma,
  • V_{\beta} = \bigcup \{V_{\alpha} : \alpha < \beta\}, for any limit ordinal \beta.

 


One obtains \{V_{\beta} : \beta \text{ is an ordinal}\} by repeatedly applying the power set operation at successor ordinals and by taking the union of all the previous sets at limit ordinals. In particular, V_{0} = \varnothing and

V_{1} = \wp (V_{0}) = \{ \varnothing \}, V_{2} = \wp (V_{1}) = \{ \varnothing,\{ \varnothing \} \}, \ldots , V_{\omega} = \bigcup \{ V_{n} : n < \omega\}, \ldots
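The first few finite stages of the hierarchy are small enough to compute explicitly. The following Python sketch models sets as nested `frozenset` values, which is an assumption made for illustration; the helper names `power_set` and `V` are hypothetical, and only finite stages are covered.

```python
from itertools import combinations

def power_set(s):
    """The set of all subsets of the finite set s."""
    elems = list(s)
    return frozenset(frozenset(c)
                     for r in range(len(elems) + 1)
                     for c in combinations(elems, r))

def V(n):
    """The finite stage V_n: apply the power set operation n times to ∅."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage

sizes = [len(V(n)) for n in range(5)]
print(sizes)
```

The sizes grow as 0, 1, 2, 4, 16, and the next stage V_{5} already has 2^{16} = 65536 elements. Each stage is a subset of the next, which is what allows the union at limit stages to collect everything built so far.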

 
The regularity axiom implies that for every set x, there exists an ordinal \alpha such that x \in V_{\alpha}. For this reason, the proper class V = \bigcup \{V_{\beta} : \beta \text{ is an ordinal}\} is called the universe of sets. It follows that each set V_{\beta} is in V and that all of the axioms in ZF are true in V. In addition, as one ascends the “ordinal spine,” one obtains sets V_{\gamma} of ever greater complexity that become better and better approximations to V (see above figure). This is confirmed by the reflection principle (see below) which, in essence, asserts that any statement that is true in V, is also true in some set V_{\beta}.

Let \varphi (v_{1}, \ldots , v_{n}) be a formula in the language of set theory with free variables v_{1}, \ldots , v_{n}. For any ordinal \alpha and x_{1}, \ldots , x_{n} \in V_{\alpha}, we write

(V_{\alpha}, \in) \vDash \varphi (x_{1}, \ldots , x_{n}) 

to mean that \varphi(x_{1}, \ldots ,x_{n}) is true in V_{\alpha}. The following theorem of ZF, due to Azriel Levy (Levy 1960) and Richard Montague (Montague 1961), implies that any specific truth that holds in V likewise holds in some initial segment V_{\beta} of V; in fact, it holds in unboundedly many initial segments.

Reflection Principle: Let \varphi(v_{1}, \ldots, v_{n}) be a formula and let \alpha be an ordinal. Then there is an ordinal \beta > \alpha such that, for all x_{1}, \ldots , x_{n} \in V_{\beta}, \varphi (x_{1}, \ldots ,x_{n}) is true in V if and only if (V_{\beta}, \in) \vDash \varphi (x_{1}, \ldots, x_{n}). 

As a corollary, for any finite number of formulas that hold in V, the reflection principle implies that all of these formulas also hold in some V_{\beta}. As noted before, there are an infinite number of axioms in ZF. Montague (Montague 1961) used the reflection principle to conclude that if ZF is consistent, then ZF is not finitely axiomatizable. Hence, ZF is not equivalent to any finite number of the axioms in ZF. This follows from Gödel’s second incompleteness theorem (see Kunen 2011, page 8), which implies that, if ZF is consistent, then one cannot prove, in ZF, the existence of a set model of ZF, that is, a set M such that (M,\in) \vDash \varphi, for every axiom \varphi in ZF.

7. Gödel’s Constructible Universe

As we have seen, the cumulative hierarchy of sets is constructed in stages. At successor stages, one adds all possible subsets of the previous stage and, at limit stages, one takes the union of all of the previously produced sets. To prove that the axiom of choice and the Continuum Hypothesis are consistent with ZF, Kurt Gödel (1938) constructed the “inner model” L of V commonly known as the universe of constructible sets. As we will see, L is a subclass of V. The idea behind Gödel’s construction of L is to modify the cumulative hierarchy structure so that the end result will produce a (smaller) class that satisfies ZF. For any set X, define D(X) by

D(X) = \{A \subseteq X : A \text{ is definable over } (X,\in)\}

where A is definable over (X,\in) means that there are x_{1},\ldots,x_{n} in X and a formula \varphi(v,x_{1},\ldots,x_{n}) such that, for all a in X,

a \in A if and only if (X,\in) \vDash \varphi (a,x_{1},\ldots,x_{n}). 

One can show, in ZF, that D is a class function (Moschovakis 2009, 8D). Using the transfinite recursion theorem and the “definable subset” operation D, Gödel defined the class \{L_{\beta} : \beta \text{ is an ordinal}\} by applying the operation D at successor ordinals and by taking the union of all of the previous sets at limit ordinals. The class \{L_{\beta} : \beta\text{ is an ordinal}\} satisfies the following (see figure below):

  • L_{0} = \varnothing,
  • L_{\gamma + 1} = D(L_{\gamma}), for any ordinal \gamma,
  • L_{\beta} = \bigcup \{L_{\alpha} : \alpha < \beta\}, for any limit ordinal \beta.

Consequently, at each successor stage of the construction, one extracts only the definable subsets of the previous stage. The proper class L = \bigcup\{L_{\beta} : \beta\text{ is an ordinal}\} is called the universe of constructible sets.

Assuming ZF, Gödel proved that L satisfies ZF, the axiom of choice, and the Continuum Hypothesis (Gödel 1990). Thus, if ZF is consistent, then so is the theory ZF+AC+CH. This result does not prove that the axiom of choice and the Continuum Hypothesis are true in V, but it does show that one cannot prove, in ZF, that either AC or CH is false.

The proper class L (with the \in relation restricted to L) is called an inner model, because it is a transitive class (a class that includes all of the elements of its elements), contains all of the ordinals, and satisfies all of the axioms in ZF.

Gödel’s notion of a constructible set has led to interesting and fruitful discoveries in set theory. By generalizing Gödel’s definition of L, contemporary set theorists have defined a variety of inner models that have been used to establish new consistency results (Kanamori 2003, pp. 34-35). Each of these inner models contains L as a subclass, and to understand the structure of these inner models, one must be familiar with the above definition of Gödel’s constructible sets. Moreover, a penetrating investigation into the structure of L has led researchers to discover many fascinating results about L and its relationship to the universe of sets V (Jech 2003).

8. Cohen’s Forcing Technique

In 1963, the mathematician Paul Cohen introduced an extremely powerful method, called forcing, for the construction of models of Zermelo-Fraenkel set theory. A model M of set theory is a transitive collection of sets in which the ZF (ZFC) axioms are all true, denoted by M \vDash ZF (M \vDash ZFC).

As discussed in section 7, Gödel showed that one cannot prove, in ZF, that either AC or CH is false. Cohen used his forcing technique to construct a model of ZFC in which the Continuum Hypothesis is false. Hence, one cannot prove, in ZFC, that CH is true. Thus, if ZFC is consistent, then CH is undecidable in ZFC. Cohen (1963) also showed that his technique of forcing can be used to produce a model of set theory in which ZF holds and the axiom of choice is false. Thus, AC is not provable in ZF. So, if ZF is consistent, then AC is undecidable in ZF.

Cohen’s idea was to start with a given set model M of ZFC (the ground model) and extend it by adjoining a “generic” set G to M where G \notin M. The resulting model M[G] (a generic extension of M) includes M, contains G, and satisfies ZFC. Cohen showed how to find a set G so that CH fails in M[G]. In a similar manner, Cohen was able to add a new set G to M such that there is an inner model of M[G] in which ZF holds and the axiom of choice is false. For his work, Cohen was awarded the Fields Medal in 1966. This award is considered to be the “Nobel Prize” of mathematics. Gödel stated that Cohen’s forcing method was “the greatest advance in the foundations of set theory since its axiomatization” (Kanamori 2003, page 32).

The discussion in the previous paragraph about M is neither complete nor entirely correct. In order to prove that the desired generic set G exists, Cohen, in fact, had to assume that M is a countable transitive set model of ZFC. Let us do the same. A partial order is a pair (P,\leq) such that P \neq \varnothing and \leq is a relation on P which is reflexive, antisymmetric, and transitive. By varying (P,\leq), one can obtain generic extensions that satisfy a wide variety of statements that are consistent with ZFC. Let (P,\leq) \in M be a partial order that is definable in M, and suppose that, in M, the definition of (P,\leq) and its properties are based only on the fact that M \vDash ZF. Since M is countable, there exists a generic set G \subseteq P (Kunen 2012, Lemma IV.2.3). Let us presume that (P,\leq) has the properties required to ensure that M[G] \vDash \varphi, where \varphi is a sentence in the language of set theory; for example, \varphi could be “not CH.” Hence, M[G] \vDash ZFC +~\varphi. Thus,

if M is a countable transitive set model of ZFC, then ZFC +~\varphi is consistent.

(6) 

To conclude that ZFC +~\varphi is consistent, it appears that one must first show that there exists a countable transitive set model of ZFC. However, by Gödel’s second incompleteness theorem, one cannot prove, in ZFC, that such a set model exists (unless ZFC is inconsistent). Is there a way around this difficulty? Note that there are finitely many axioms in ZFC such that if just these axioms hold in M, then one can still prove that M[G] \vDash \varphi (Kunen 2011).

We now discuss how the above argument used to establish (6) can be modified to correctly conclude that ZFC +~\varphi is consistent. Let T be a finite set of axioms in ZFC. Using the reflection principle, one can prove, in ZFC, that

there is a countable transitive set model M in which the axioms in T are true.

(7) 

For any finite set S of axioms in ZFC, the forcing method shows that there is a finite set T of axioms in ZFC such that S \subseteq T and

if M is a countable transitive set model in which the axioms in T hold, then there is a generic extension M[G] in which \varphi and the axioms in S hold.

(8) 

Since T is a finite set of axioms, we conclude from (7) that there is a countable transitive set model M that satisfies all of the axioms in T. Therefore, by (8), there is a generic extension M[G] that satisfies \varphi and all of the axioms in S. Since proofs are finite, we conclude that, in ZFC, one cannot prove \neg \varphi. Hence, ZFC +~\varphi is consistent, assuming that ZFC is consistent.

Cohen’s forcing technique is very versatile and has been used to show that there are many statements, both in set theory and in mathematics, that are undecidable (or unprovable) in ZF and ZFC. For example, in mathematics, the Hahn–Banach theorem is a crucial tool used in functional analysis. The proof of this theorem uses the axiom of choice. The forcing method has been used to show that the Hahn–Banach theorem is not provable in ZF alone (Jech 1973). Moreover, using forcing results and the universe of constructible sets, Saharon Shelah (1974) has shown that a famous open problem in abelian group theory (Whitehead’s Problem) is undecidable in ZFC.

As suggested earlier, since essentially all mathematical concepts can be formalized in the language of set theory, set theory offers a unifying theory for mathematics. Thus, the theorems of mathematics can be viewed as assertions about sets. Moreover, these theorems can also be proven from ZFC, the Zermelo-Fraenkel axioms together with the axiom of choice. Cohen’s forcing method clearly shows that ZFC is an incomplete theory, as there are statements that cannot be resolved in it. This motivates the following question:

What path should be taken to try to settle the Continuum Hypothesis and other undecided statements in mathematics? 

In contemporary set theory, the most common answer to this question is called Gödel’s Program:

Search for new axioms, which, when added to ZFC, will determine the truth or falsity of unresolved statements. 

This program was inspired by an article of Gödel’s in which he discusses the mathematical and philosophical aspects of mathematical statements that are independent of ZFC (Gödel 1947). Sections 9 and 10 will discuss two directions that this program has taken: large cardinal axioms and determinacy axioms.

9. Large Cardinal Axioms

Roughly, a large cardinal axiom is a set-theoretic statement that asserts the existence of an uncountable cardinal \kappa that satisfies a particular property that implies that there is a set M such that (M,\in) is a model of ZFC; such a \kappa is called a large cardinal. Gödel’s second incompleteness theorem implies that, in ZFC, one cannot prove the existence of large cardinals. Thus, a large cardinal axiom is a “new axiom.” Most modern set theorists believe that the standard large cardinal axioms are consistent with ZFC.

Assuming ZFC, let us say that a cardinal \kappa is a strong limit cardinal if and only if, for every cardinal \lambda, if \lambda < \kappa, then 2^{\lambda} < \kappa. A cardinal \kappa is said to be inaccessible if and only if \kappa is uncountable, regular, and a strong limit cardinal. Recall that a cardinal \kappa is regular if \kappa is not the union of fewer than \kappa many sets, each of size less than \kappa. If \kappa is an inaccessible cardinal, then, in ZFC, one can prove that (V_{\kappa},\in) is a model of ZFC (Kanamori 2003). Hence, such a \kappa is an example of a large cardinal and so, the statement “there exists an inaccessible cardinal” is a large cardinal axiom.
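To see that each of the three clauses in the definition of an inaccessible cardinal does real work, it may help to look at familiar cardinals that satisfy two clauses but fail the third. The examples below are standard facts of ZFC; the \beth notation is not used elsewhere in this article and is defined inside the block.

```latex
\begin{align*}
\aleph_{0} &: \text{regular and a strong limit (} 2^{n} < \aleph_{0} \text{ for every finite } n\text{), but not uncountable;} \\
\aleph_{1} &: \text{uncountable and regular, but not a strong limit, since } 2^{\aleph_{0}} \geq \aleph_{1}; \\
\beth_{\omega} &: \text{uncountable and a strong limit, but not regular,} \\
&\phantom{:}\ \text{where } \beth_{0} = \aleph_{0},\ \beth_{n+1} = 2^{\beth_{n}},\ \text{and } \beth_{\omega} = \sup\{\beth_{n} : n \in \omega\}.
\end{align*}
```

The cardinal \beth_{\omega} fails regularity because it is the union of countably many sets of sizes \beth_{0}, \beth_{1}, \beth_{2}, \ldots, each smaller than \beth_{\omega}.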

There are other large cardinal axioms. The description of these large cardinal axioms usually involves the concept of an elementary embedding of the universe, that is, a nontrivial truth preserving transformation from (V,\in) into (M,\in) where M is a transitive subclass of V. A theorem of Kenneth Kunen (Jech 2003) shows that there is no nontrivial elementary embedding of the universe V into itself. Thus, for any nontrivial truth preserving transformation from (V,\in) into (M,\in) where M is a transitive subclass of V, M \neq V. More specifically, a large cardinal axiom can be expressed as asserting that there exists a nontrivial (class) function

j: V \rightarrow M 

such that for each formula \varphi(v_{1},v_{2},\ldots,v_{n}) (in the language of set theory) and for all elements x_{1},\ldots,x_{n} in V,

(V,\in) \vDash \varphi(x_{1},\ldots,x_{n}) if and only if (M,\in) \vDash \varphi(j(x_{1}),\ldots,j(x_{n})). 

Since the embedding j is not the identity, there must be a least ordinal \kappa such that \kappa < j(\kappa). This ordinal is called the critical point of j and is denoted by \kappa = crit(j). It follows that \kappa is a cardinal; indeed, \kappa is the large cardinal that is confirmed by the existence of the embedding j.

A cardinal \kappa is said to be measurable if and only if there exists an embedding j: V \rightarrow M such that \kappa is the critical point of j. In this case, one can prove that V_{\kappa+1} \subseteq M. Therefore, there is some resemblance between M and V. Increasingly stronger large cardinal axioms demand a greater agreement between M and V. For example, if one requires that V_{\kappa+2} \subseteq M, then one obtains a stronger large cardinal axiom. For another example, a cardinal \kappa is said to be superstrong if and only if there is a transitive class M and a nontrivial elementary embedding j: V \rightarrow M such that \kappa = crit(j) and V_{j(\kappa)} \subseteq M. Even stronger large cardinal axioms are obtained by requiring greater and greater resemblance between M and V (Woodin 2011).

Large cardinal axioms are statements that assert the existence of large cardinals. These axioms are widely viewed as being very promising new axioms for set theory. Large cardinal axioms do not resolve the Continuum Hypothesis but they have led mathematicians to formulate conditions under which Cantor’s hypothesis is false (Woodin 2001, p. 688). As already mentioned, one cannot prove, in ZFC, that large cardinals exist. Yet, there is very strong evidence that their existence cannot be refuted in ZFC (Maddy 1988).

10. The Axiom of Determinacy

Descriptive set theory has its origins, in the early 20th century, with the theory of real-valued functions and sets of real numbers developed by Borel, Baire, and Lebesgue. These analysts, respectively, introduced

  • the hierarchy of Borel sets of real numbers,
  • the Baire hierarchy of real-valued functions,
  • Lebesgue measurable sets of real numbers.

Descriptive set theory extends the work of these mathematicians (Moschovakis 2009). Recall that \omega = \{0,1,2,3,4,\ldots\} is the set of natural numbers. Let ^{\omega}\omega be the set of all functions from \omega to \omega. The set ^{\omega}\omega is denoted by \mathbb{R} and is called Baire Space. \mathbb{R} is often referred to as the set of reals; and if x \in \mathbb{R}, then x is called a real. \mathbb{R} is regarded as a topological space by giving it the product topology, using the discrete topology on \omega. The space \mathbb{R} is homeomorphic to the set of irrational numbers which is a subspace of the set of real numbers (Moschovakis 2009).

Descriptive set theory is a branch of set theory that uses set theoretic tools to investigate the structure of definable sets and functions over \mathbb{R}. One can identify the level of complexity of such definable sets of reals (Moschovakis 2009). Thus, there is a natural hierarchy on the definable subsets of \mathbb{R}, which, in increasing order of complexity, is called the projective hierarchy.

As a result of Gödel’s and Cohen’s work, it has been shown that many questions in descriptive set theory are not decidable in axiomatic set theory. For example, in 1938, Gödel showed that in L, the universe of constructible sets, there are projective sets of reals that are not Lebesgue measurable. In 1970, using the method of forcing, Robert Solovay showed that if there is an inaccessible cardinal, then ZFC is consistent with the statement that every projective set is Lebesgue measurable. Thus, one can neither prove nor disprove, in ZFC, the Lebesgue measurability of projective sets. Hence, in ZFC, the theory of projective sets is incomplete. For this reason, modern descriptive set theory focuses on new axioms; one such axiom concerns infinite games.

Gale and Stewart (1953) introduced the general concept of an infinite game of perfect information and began the study of these games. Other mathematicians then pursued this subject and discovered that it can be used to resolve problems in descriptive set theory.

We now turn to a description of infinite games and strategies. For each A \subseteq \mathbb{R}, we associate a two-person infinite game on \omega with payoff A, denoted by G_{A}, where players I and II alternately choose natural numbers a_{i} in the order given in the diagram:

I: a_{0}, a_{2}, a_{4}, \ldots
II: a_{1}, a_{3}, a_{5}, \ldots

After completing an infinite number of moves, the players produce the real

x = ⟨a_{0},a_{1},a_{2},\ldots⟩.

Player I is said to win if x \in A, otherwise player II is said to win. As each player is aware of all the previous moves before making a next move, the game is called a game of perfect information. The game G_{A} is said to be determined if and only if either player has a “winning strategy,” that is, a function that ensures the player will win the game regardless of how the other player makes his or her moves. The Axiom of Determinacy (AD) is a regularity hypothesis about such games that states: For all A \subseteq \mathbb{R}, the game G_{A} is determined.
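The notions of winning strategy and determinacy are easier to see in a finite analogue. The sketch below is not the infinite game G_{A}: it assumes a game that stops after a fixed number of moves drawn from a finite set, so that backward induction can decide who wins. The names `wins`, `is_determined`, `moves`, and `length` are illustrative choices, not notation from the text.

```python
from itertools import combinations, product

def wins(player, payoff, moves, length, prefix=()):
    """Return True if `player` (1 or 2) has a winning strategy from `prefix`.

    payoff: set of tuples of size `length`; player 1 wins exactly these plays.
    moves:  the finite list of numbers either player may choose from.
    """
    if len(prefix) == length:
        in_payoff = prefix in payoff
        return in_payoff if player == 1 else not in_payoff
    # Player 1 moves at even positions, player 2 at odd positions.
    to_move = 1 if len(prefix) % 2 == 0 else 2
    results = [wins(player, payoff, moves, length, prefix + (a,)) for a in moves]
    # The player to move needs one good move; the opponent's moves must all be survivable.
    return any(results) if to_move == player else all(results)

def is_determined(payoff, moves, length):
    """A finite game is determined if exactly one player has a winning strategy."""
    return wins(1, payoff, moves, length) != wins(2, payoff, moves, length)

moves = [0, 1]
plays = list(product(moves, repeat=4))
even_sum = {x for x in plays if sum(x) % 2 == 0}       # player 2 moves last and controls parity
starts_with_one = {x for x in plays if x[0] == 1}      # player 1 decides this on the first move

# Check every possible payoff set for the two-move game.
short_plays = list(product(moves, repeat=2))
all_payoffs = [set(c) for r in range(len(short_plays) + 1)
               for c in combinations(short_plays, r)]
all_determined = all(is_determined(A, moves, 2) for A in all_payoffs)
```

In this finite setting every game is determined, which is why `all_determined` holds for all sixteen payoff sets. The Axiom of Determinacy asserts the analogous property for every infinite game G_{A}, where no such backward induction is available.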

In the theory ZF+AD, one can resolve many open questions about the sets of real numbers. For example, one can prove Cantor’s original form of the continuum hypothesis: Every uncountable set of real numbers has the same cardinality as the full set of real numbers.

Moreover, it has been shown that the axiom of choice implies that AD is false; that is, using the axiom of choice, one can construct a set of reals A such that the game G_{A} is not determined. Thus, the axiom of determinacy is incompatible with the axiom of choice. However, it is not clear that one can establish, without the axiom of choice, the existence of a set of reals A such that the game G_{A} is not determined (Moschovakis 2009). Moreover, there are weaker versions of AD that are compatible with ZF together with a weaker choice principle called the axiom of dependent choices.

Axiom of Dependent Choices (DC). Let R be a relation on a nonempty set A. Suppose that for all x \in A there is a y \in A such that R(x,y). Then there exists a function f: \omega \rightarrow A such that, for all n \in \omega, R(f(n),f(n+1)).

Many mathematicians working in descriptive set theory operate within the background theory ZF+DC and the following determinacy axiom: For every projective set A, the game G_{A} is determined. This axiom is denoted by PD (projective determinacy). Under the theory ZF+DC+PD, the classic open questions about projective sets have been successfully addressed (Moschovakis 2009). In particular, this theory implies that all projective sets are Lebesgue measurable.

Generalizing the construction of the inner model L, one can construct the inner model L(\mathbb{R}), the smallest inner model that contains all the ordinals and all the reals. The set \wp(\mathbb{R}) \cap L(\mathbb{R}) can be viewed as a natural extension of the projective sets. The determinacy hypothesis, denoted by AD^{L(\mathbb{R})}, asserts that AD holds in L(\mathbb{R}). Since the inner model L(\mathbb{R}) contains all of the projective sets, the assumption AD^{L(\mathbb{R})} implies PD.

There are very deep results that connect determinacy hypotheses and large cardinal axioms. In 1988, Martin and Steel, working in ZFC, identified a large cardinal axiom that implies PD. By assuming a stronger large cardinal axiom, Woodin, within ZFC, was able to prove that AD^{L(\mathbb{R})} holds and so, L(\mathbb{R}) satisfies ZF+AD. Moreover, PD and AD^{L(\mathbb{R})}, individually, imply the consistency of certain large cardinal axioms (Kanamori 2003). Investigating the relationships between determinacy hypotheses and large cardinals has become an important component of modern set theory.

11. Concluding Remarks

Set theory is a rich and beautiful branch of mathematics whose fundamental concepts permeate all branches of mathematics. It is a most extraordinary fact that all standard mathematical objects can be defined as sets. For example, the natural numbers and the real numbers can be constructed within set theory. In addition, algebraic structures, functional spaces, vector spaces, and topological spaces can be viewed as sets in the universe of sets V. Consequently, mathematical theorems can be regarded as statements about sets. These theorems can also be proven from ZFC, the axioms of set theory. Thus, mathematics can be embedded into set theory.

Since all of conventional mathematics can be developed within set theory, one can view certain results in set theory as being part of metamathematics, the field of study within mathematics that uses mathematical tools to investigate the nature and power of mathematics. For example, using the forcing technique and inner models, it has been shown that there are mathematical statements that cannot be proven or disproven in ZFC. Thus, when a particular mathematical statement is unresolved, set theory can sometimes show that there is neither a proof nor a refutation of the statement in ZFC. As noted above, this situation has inspired the search for new set theoretic axioms.

Of course, the fact that set theory offers a foundation for mathematics indicates that set theory is a very important branch of mathematics. However, the concepts and techniques developed within set theory demonstrate that, in itself, set theory is a deep and exciting branch of mathematics with significant applications to other areas of mathematics. This success has inspired some philosophers of mathematics to direct their attention to the philosophy of set theory and the search for new axioms (Maddy 1988a, 1988b, 2011).

12. References and Further Reading

a. Primary Sources

  • Banach, S. and Tarski, A. 1924. “Sur la décomposition des ensembles de points en parties respectivement congruentes,” Fund. Math., 6, pp. 244–277.
  • Cantor, Georg. 1874. “Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen,” Journal für die reine und angewandte Mathematik (Crelle). 77, 258–262.
  • Cohen, Paul J. 1963. The independence of the axiom of choice. Mimeographed.
  • Cohen, Paul J. 1963a. “The independence of the continuum hypothesis I.” Proceedings of the U.S. National Academy of Sciences 50, 1143-48.
  • Cohen, Paul J. 1964. “The independence of the continuum hypothesis II.” Proceedings of the U.S. National Academy of Sciences 51, 105-110.
  • Cohen, Paul J. 1966. Set Theory and the Continuum Hypothesis, New York: Benjamin.
  • Cunningham, Daniel W. 2016. Set Theory: A First Course, New York: Cambridge University Press.
  • Dauben, Joseph W. 1979. Georg Cantor: his mathematics and philosophy of the infinite, Cambridge, Mass., Harvard University Press; reprinted: Princeton, Princeton University Press, 1990.
  • Dunham, William. 1990. Journey Through Genius: The Great Theorems of Mathematics (1st ed.). John Wiley and Sons.
  • Gale, D. and Stewart, F.M. 1953. “Infinite games with perfect information,” Annals of Math. Studies, vol. 28, pp. 245–266.
  • Gödel, Kurt. 1947. “What is Cantor’s Continuum Problem?,” American Mathematical Monthly, vol. 54, pp. 515-525.
  • Gödel, Kurt. 1986. Collected Works, Volume I: Publications 1929–1936, (Solomon Feferman, editor-in-chief), Oxford University Press, New York.
  • Gödel, Kurt. 1990. Collected Works, Volume II: Publications 1938–1974, (Solomon Feferman, editor-in-chief), Oxford University Press, New York.
  • Gödel, Kurt. 1995. Collected Works, Volume III: Unpublished Essays and Lectures, (Solomon Feferman, editor-in-chief), Oxford University Press, New York.
  • Hilbert, David. 1923. On the infinite. Reprinted in the Philosophy of Mathematics: Selected Readings, 1983, edited by Paul Benacerraf and Hilary Putnam, pp. 183-201.
  • Jech, Thomas. 2003. Set theory. Third Edition, New York: Springer.
  • Jech, Thomas. 1973. The Axiom of Choice, North-Holland Publishing Company, Studies in logic and the foundations of mathematics, vol. 75, Amsterdam.
  • Kanamori, Akihiro. 2003. The Higher Infinite. Perspectives in Mathematical Logic. Second edition. Berlin: Springer.
  • Kanamori, Akihiro. 2012. Set theory from Cantor to Cohen, a book chapter in: Handbook of the History of Logic: Sets and Extensions in the Twentieth Century. Volume editor: Akihiro Kanamori. General editors: Dov M. Gabbay, Paul Thagard and John Woods. Elsevier BV.
  • Kunen, Kenneth. 2009. The Foundations of Mathematics. Studies in Logic, vol. 19. London: College Publications.
  • Kunen, Kenneth. 2011. Set Theory. Studies in Logic, vol. 34. London: College Publications.
  • Lévy, Azriel. 1960. “Axiom schemata of strong infinity in axiomatic set theory,” Pacific Journal of Mathematics, 10, pp. 223–238.
  • Maddy, Penelope H. 1988a. “Believing the axioms I.” The Journal of Symbolic Logic, 53(2), 481–511.
  • Maddy, Penelope H. 1988b. “Believing the axioms II.” The Journal of Symbolic Logic, 53(3), 736–764.
  • Maddy, Penelope H. 2011. Defending the axioms. On the philosophical foundations of set theory. Oxford: Oxford University Press.
  • Montague, Richard M. 1961. Fraenkel’s addition to the axioms of Zermelo. Essays on the foundations of mathematics, dedicated to A. A. Fraenkel on his seventieth anniversary, edited by Y. Bar-Hillel, E. I. J. Poznanski, M. O. Rabin, and A. Robinson for The Hebrew University of Jerusalem, Magnes Press, Jerusalem, and North-Holland Publishing Company, Amsterdam, pp. 91–114.
  • Moore, Gregory H. 2012. Zermelo’s Axiom of Choice: Its Origins, Development, and Influence. Mineola, NY: Dover Publications. Reprint of the 1982 original published by Springer.
  • Moschovakis, Yiannis. 2009. Descriptive Set Theory, 2nd edition, vol. 155 of Mathematical Surveys and Monographs, American Mathematical Society, Providence, 2009.
  • Solovay, Robert. 1970. “A model of set theory in which every set is Lebesgue measurable.” Annals of Mathematics, vol. 92, 1–56.
  • Shelah, Saharon. 1974. “Infinite abelian groups, Whitehead problem and some constructions.” Israel J. Math, vol. 18, 243–256.
  • Woodin, Hugh. 2001. “The Continuum Hypothesis, Part II.” Notices of the American Mathematical Society, vol. 48, no. 7.
  • Woodin, Hugh. 2011. Infinity, a book chapter in: Infinity: New Research Frontiers. Cambridge: Cambridge University Press.
  • Zermelo, Ernst. 2010. Collected Works. Gesammelte Werke. Volume I: Set Theory, Miscellanea. Mengenlehre, Varia, edited by H.-D. Ebbinghaus and A. Kanamori, Springer, Berlin and Heidelberg, xxiv + 654 pp.

b. Secondary Sources

  • Ebbinghaus, Heinz-Dieter. 2007. Ernst Zermelo. An Approach to His Life and Work. Berlin: Springer. In cooperation with Volker Peckhaus.
  • Enderton, Herbert B. 1977. Elements of Set Theory. New York: Academic Press.
  • Enderton, Herbert B. 2001. A Mathematical Introduction to Logic. 2nd edn. Burlington, MA: Harcourt/Academic Press.
  • Feferman, Solomon, Parsons, Charles and Simpson, Steven G. (Eds.). 2010. Kurt Gödel: essays for his centennial. Cambridge: Cambridge University Press.
  • Halmos, Paul R. 1974. Naive Set Theory. New York: Springer. Reprint of the 1960 edition published by Van Nostrand.
  • Hauser, Kai. 2006. “Gödel’s Program Revisited Part I: The Turn to Phenomenology.” Bulletin of Symbolic Logic, 12(4), 529–590.
  • Heller, Michael and Woodin, Hugh. (Eds.). 2011. Infinity: New Research Frontiers. Cambridge: Cambridge University Press.
  • Kanamori, Akihiro. 2012. “In praise of replacement.” Bulletin of Symbolic Logic, 18(1), 46–90.
  • Levy, Azriel. 2002. Basic Set Theory. Mineola, NY: Dover Publications. Reprint of the 1979 original published by Springer.
  • Moschovakis, Yiannis. 2006. Notes on Set Theory. 2nd edition. Undergraduate Texts in Mathematics. New York: Springer.
  • Potter, Michael. 2004. Set theory and Its Philosophy. New York: Oxford University Press.

Author Information

Daniel Cunningham
Email: cunnindw@buffalostate.edu
State University of New York Buffalo State
U. S. A.

Future Contingents

The riddle of the future bewilders human beings. On the one hand, we are inclined to think that future events are real in some sense, because we ask questions and make assertions about them. On the other hand, we are inclined to think that future events may depend on our choices, because we conceive of ourselves as free agents. These two inclinations seem to clash. If an event belongs to the future, then it is a fact that it will occur, and we cannot prevent it from occurring. Inversely, if we can prevent an event from occurring, then it cannot be a fact that it will occur. This apparent conflict is at the core of the debate on future contingents, a philosophical dispute that goes back to antiquity. Future contingents are sentences that concern future events that can occur or not occur. The question that started the debate—whether future contingents are true or false—is a question that has no clear answer, given that one may have different views about the truth and falsity of a sentence about the future. Yet an answer must be provided, and it cannot be just any answer. The constraints that define the problem of future contingents determine a restricted set of admissible answers, each of which gives rise to doubts, troubles, and complications.

Table of Contents

  1. The Problem
    a. Speaking about the Future
    b. The Sea Battle
    c. Bivalence, Excluded Middle, Fatalism
    d. Two Arguments
  2. Three Logical Options
    a. Neither Bivalence nor Excluded Middle
    b. Excluded Middle without Bivalence
    c. Both Bivalence and Excluded Middle
    d. Further Considerations
  3. Three Metaphysical Views
    a. Past, Present, and Future Entities
    b. No Future
    c. Many Futures
    d. One Future
  4. The Open Future
    a. Alternative Possibilities
    b. Indetermination
    c. Causal Power
    d. Other Definitions
  5. References and Further Reading

1. The Problem

a. Speaking about the Future

Tomorrow many things will happen. Some of them are things of which it seems correct to assert that they will happen, others are things of which it does not seem correct to assert that they will happen. For example, it seems correct to assert that the sun will rise. By contrast, it does not seem correct to assert that exactly 3,245 pigeons will walk in Piazza San Marco.

The reason why in certain cases it seems correct to assert that things will go a certain way is that in those cases we take it to be true that things will go that way. As far as we know, the sun will rise tomorrow. Of course, we are not absolutely certain that it will. We might be wrong, due to unforeseen circumstances. However, the evidence that supports our prediction is solid.

Similarly, the reason why in certain cases it does not seem correct to assert that things will go a certain way is that in those cases we do not know whether things will go that way; that is, it may easily be false that things will go that way. We are not in a position to tell whether exactly 3,245 pigeons will walk in Piazza San Marco. As far as we know, the number of pigeons that will walk in Piazza San Marco may easily be bigger or smaller.

In this respect, assertions about the future resemble assertions about the past. The cases in which it seems correct to assert that things went a certain way are cases in which we take it to be true that things went that way. For example, it seems correct to assert that dinosaurs disappeared a long time ago. Conversely, the cases in which it does not seem correct to assert that things went a certain way are cases in which we do not know whether things went that way. For example, it does not seem correct to assert that Caesar was annoyed by a mosquito while crossing the Rubicon.

More generally, the ordinary use of language suggests that assertions about the future, just like assertions about the past, can be correct or incorrect. Therefore, this suggests that future-tense sentences, like past-tense sentences, can be true or false. For example, “The sun will rise tomorrow” seems true. Conversely, “The sun will not rise tomorrow” seems false. Note that “The sun will rise tomorrow” does not express a necessary truth, that is, it is not a sentence such as “2+2=4.” Although unlikely, it is possible that it is false. Similarly, “The sun will not rise tomorrow” does not express a necessary falsity, that is, it is not a sentence such as “2+2=5.”

The problem that this article addresses concerns future contingents, that is, sentences about future events that can occur or not occur. According to a line of thought that goes back to Aristotle, these sentences are neither true nor false. Hence, the linguistic analogy just considered is misleading: Assertions about the future are not like assertions about the past.

b. The Sea Battle

In chapter 9 of De Interpretatione, Aristotle asks whether it makes sense to say that a sentence about a future event that can occur or not occur is true or false. His answer is that it does not make sense, for if the sentence were true or false, then the event would be necessary or impossible:

Let us take, for example, a sea battle. It is requisite on our hypothesis that it should neither take place nor fail to take place tomorrow. These and other strange consequences follow, provided we assume in the case of a pair of contradictory opposites having universals for subjects and being themselves universal or having an individual subject, that one must be true, the other false, that there can be no contingency and that all things that are or take place come about in the world by necessity. (Aristotle, De Interpretatione, 18b23 ff.)

Aristotle’s reasoning seems to be the following. Consider the sentences (1) and (2) as uttered today:

(1) There will be a sea battle tomorrow.

(2) There will not be a sea battle tomorrow.

If (1) were true, and (2) were false, then it would be settled today that there will be a sea battle tomorrow, so the sea battle would be necessary. Similarly, if (2) were true, and (1) were false, then it would be settled today that there will not be a sea battle tomorrow, so the sea battle would be impossible. Since the sea battle is contingent, that is, it is neither necessary nor impossible, this shows that (1) and (2) are neither true nor false.

For Aristotle, the claim that (1) and (2) are neither true nor false is consistent with the plausible assumption that the disjunction formed by (1) and (2) is true:

(3) Either there will be a sea battle tomorrow or there will not.

Aristotle seems to think that (3) expresses a necessary truth, although the same does not hold for (1) and (2) taken separately:

That every thing is or is not is necessary, and also that it will be or it will not be; however, certainly not that, taken separately, one or the other is necessary. I say for example that it is necessary that either there will be a sea battle tomorrow or there will not be a sea battle tomorrow, but it is neither necessary that a sea battle will occur tomorrow nor that it will not occur. Rather, it is necessary that it will occur or not. (Aristotle, De Interpretatione, 19a25-30)

Another aspect of Aristotle’s point is that the claim that (1) and (2) are neither true nor false does not reduce to the observation that we do not know whether there will be a sea battle tomorrow. Of course, we do not know whether there will be a sea battle tomorrow. The absence of truth or falsity that Aristotle ascribes to (1) and (2), however, is independent of our epistemic condition. The problem of future contingents concerns truth rather than knowledge. Compare (1) with “There was a sea battle yesterday.” We can easily imagine a situation in which one does not know whether a sea battle occurred the day before. Despite this, independently of whether one knows it or not, it seems right to say that “There was a sea battle yesterday” is either true or false. Its truth or falsity depends on what happened the day before. Aristotle suggests that (1) differs in this respect, because there is nothing that can make it true or false.

c. Bivalence, Excluded Middle, Fatalism

The problem of future contingents stems from the combination of three ingredients. Two of them are fundamental logical principles, namely, bivalence and excluded middle. The third is a controversial metaphysical doctrine, namely, fatalism.

Bivalence is the principle according to which truth and falsity are reciprocally exclusive and jointly exhaustive values. Classical logic relies on bivalence, in that it assumes that every sentence is true or false. If the letter p is used as a schematic expression that stands for any sentence, this assumption can be stated as follows:

(B) Either “p” is true or “p” is false.

For example, “p” can be replaced with “Snow is white,” “Snow is green,” or any other sentence.

Here, “any other sentence” includes not only simple sentences, such as those just considered, but also complex sentences, such as “Snow is not white,” “If snow is green, then it is not white,” and “Either snow is white or it is green.” The last three sentences are respectively a negation, a conditional, and a disjunction, in that they are formed by means of the connectives “not,” “if/then,” and “or.” In classical logic, complex sentences formed in this way are treated as truth functions of their constituents, which means that their truth or falsity is determined by the truth and falsity of their constituents. More precisely, the negation of a sentence is true if and only if the sentence is false, a conditional is true if and only if it is not the case that its antecedent is true and its consequent is false, and a disjunction is true if and only if at least one of its disjuncts is true. Thus, bivalence is consistent with the assumption that some connectives—such as “not,” “if/then,” and “or”—are truth-functional, that is, that the complex sentences formed by means of these connectives are truth functions of their constituents.

Excluded middle is the principle according to which every disjunction formed by a sentence and its negation is true. In schematic form:

(E) Either p or not-p.

Classical logic justifies (E) in that it assumes that negation and disjunction are defined in the way explained. From that definition, it turns out that, no matter whether it is the case that p, one of the disjuncts of (E) must be true.
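The truth-functional definitions just given can be checked mechanically. The following is a minimal sketch that assumes bivalence by letting every sentence take exactly one of the two values `True` and `False`; the function names `neg`, `cond`, `disj`, and `is_tautology` are illustrative and not part of any standard notation.

```python
from itertools import product

def neg(a):
    # A negation is true if and only if the sentence is false.
    return not a

def cond(a, b):
    # A conditional is true iff it is not the case that the
    # antecedent is true and the consequent is false.
    return not (a and not b)

def disj(a, b):
    # A disjunction is true iff at least one disjunct is true.
    return a or b

def is_tautology(formula, n_atoms):
    """Return True if `formula` is true under every assignment of truth values."""
    return all(formula(*values) for values in product([True, False], repeat=n_atoms))

def excluded_middle(p):
    # The schema (E): either p or not-p.
    return disj(p, neg(p))

lem_holds = is_tautology(excluded_middle, 1)                        # (E) is true on both rows
plain_disjunction = is_tautology(lambda p, q: disj(p, q), 2)        # fails when both are false
```

Under these definitions (E) comes out true on every row of its truth table, whereas an arbitrary disjunction such as "p or q" does not. The check presupposes that each sentence has one of the two values, which is exactly what is in question for future contingents.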

Finally, fatalism is the doctrine according to which nothing is contingent, that is, everything is either necessary or impossible:

(F) Either it is necessary that p or it is impossible that p.

From (F) we get that if p, then it is necessary that p, and if not-p, then it is impossible that p. Suppose that p. Then the second disjunct of (F) is false, and hence the first must be true. Suppose that not-p. Then the first disjunct of (F) is false, and hence the second must be true. Note that here “necessary” and “impossible” are understood as “necessary given our past and our present” and “impossible given our past and our present,” that is, without taking into account what could happen if our past and our present were different. The problem of future contingents concerns future possibilities. It does not concern past or present possibilities.

The thesis that nothing is contingent is sometimes called “necessitarianism,” and the term “fatalism” often expresses the view that no one has free will, understood as the ability to do otherwise than what one actually does. However, even when a distinction is drawn between necessitarianism and fatalism, it is usually taken for granted that there is a close connection between them: If we are unable to do otherwise than we actually do, it is because what we do is necessary. In any case, independently of what “fatalism” means, (F) is controversial because it is at odds with free will. If nothing is contingent, then it is hard to see how one can be free to choose one course of action rather than another.

d. Two Arguments

The reasoning that emerges from the first quote in section 1.b suggests that bivalence entails fatalism. Suppose that (1) is either true or false. Assuming that the truth of (1) makes the sea battle necessary, and that the falsity of (1) makes the sea battle impossible, it follows that either it is necessary or it is impossible that there will be a sea battle. The argument may be phrased in schematic form as follows:

[BF]

(B) Either “p” is true or “p” is false.

(A1) If “p” is true, then it is necessary that p.

(A2) If “p” is false, then it is impossible that p.

So, (F) Either it is necessary that p or it is impossible that p.

[BF] is valid, in that its conclusion follows from its premises. Suppose that (B), (A1), and (A2) are true. Then one of the disjuncts of (B) is true. This means that either the antecedent of (A1) or the antecedent of (A2) is true, hence that either the consequent of (A1) or the consequent of (A2) is true. So (F) must be true. If one accepts the premises of a valid argument, one is compelled to accept its conclusion. Therefore, one cannot accept (B), (A1), and (A2) without accepting (F). By contraposition, if one takes (F) to be false, one must think that there is something wrong in the premises of [BF]. Aristotle thinks that the mistake lies in (B), as he takes (A1) and (A2) to be true.

Since (B) and (E) are distinct logical principles, rejecting (B) does not amount to rejecting (E). Aristotle is clearly aware of this fact, as shown by the second quote in section 1.b. However, there is another fact that he does not take into account, namely, that if one grants two apparently innocuous assumptions about truth and falsity, one can get bivalence from excluded middle. The argument is the following:

[EB]

(E) Either p or not-p.

(A3) If p, then “p” is true.

(A4) If not-p, then “p” is false.

So, (B) Either “p” is true or “p” is false.

[EB] is valid, as is [BF]. Here, again, the first premise is a disjunction, the second and third premises are conditionals in which the two disjuncts occur as antecedents, and the conclusion is a disjunction formed by the two consequents. This means that if (E), (A3), and (A4) are true, then (B) must be true.
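The shared form of the two arguments, and the way they chain together, can be checked mechanically. The following Lean sketch treats "“p” is true", "“p” is false", "it is necessary that p", and "it is impossible that p" as opaque propositions; the names `Tr`, `Fl`, `Nec`, and `Imp` are placeholders introduced here, not notation from the literature. It shows only that the inferences are valid, not that their premises are true.

```lean
-- [EB]: from (E), (A3), (A4) infer (B).
theorem EB {P Tr Fl : Prop}
    (e : P ∨ ¬P) (a3 : P → Tr) (a4 : ¬P → Fl) : Tr ∨ Fl :=
  e.elim (fun hp => Or.inl (a3 hp)) (fun hn => Or.inr (a4 hn))

-- [BF]: from (B), (A1), (A2) infer (F).
theorem BF {Tr Fl Nec Imp : Prop}
    (b : Tr ∨ Fl) (a1 : Tr → Nec) (a2 : Fl → Imp) : Nec ∨ Imp :=
  b.elim (fun ht => Or.inl (a1 ht)) (fun hf => Or.inr (a2 hf))

-- Chaining the two: (E) together with (A1)-(A4) yields (F).
theorem EF {P Tr Fl Nec Imp : Prop}
    (e : P ∨ ¬P) (a3 : P → Tr) (a4 : ¬P → Fl)
    (a1 : Tr → Nec) (a2 : Fl → Imp) : Nec ∨ Imp :=
  BF (EB e a3 a4) a1 a2
```

Both proofs are instances of the same pattern: a disjunction plus two conditionals whose antecedents are the disjuncts.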

Now the problem of future contingents becomes evident. According to [BF], bivalence entails fatalism. According to [EB], excluded middle entails bivalence. Therefore, from the combination of [EB] and [BF] we get that excluded middle entails fatalism. Since fatalism is unacceptable—or so assume Aristotle and many others after him—there must be something wrong with at least one of the premises of [BF] and [EB]. Determining which is the problem. Questions arise as to whether bivalence and excluded middle are sound logical principles, whether bivalence really entails fatalism, and whether excluded middle really entails bivalence. To solve the problem of future contingents is to provide satisfactory answers to these questions.

2. Three Logical Options

a. Neither Bivalence nor Excluded Middle

Now we will consider three distinct theses about bivalence and excluded middle, which constitute the main logical options available to solve the problem of future contingents. These three theses share two basic assumptions: One is that fatalism is wrong, and the other is that [BF] and [EB] are valid. Thus, they agree that (E) and (A1)-(A4) are not all true. If (E) and (A1)-(A4) were all true, on the second assumption it would follow that (F) is true, contrary to the first assumption.

The first option—option 1—is to deny both bivalence and excluded middle. According to this option, bivalence does not hold. Since (A1) and (A2) are true, if (B) were true, then (F) would be true. Excluded middle does not hold either, for (A3) and (A4) are just as true as (A1) and (A2). So, if (E) were true, then (B) would be true as well. In other terms, [BF] and [EB] are alike in that their first premise is false.

In the debate over future contingents, the theory that best expresses option 1 is Lukasiewicz’s three-valued logic (Lukasiewicz 1970). This theory, which is intended to provide a coherent interpretation of Aristotle, shares with classical logic the tenet of truth-functionality; that is, it takes for granted that the value of a complex sentence is determined by the values of its constituents. However, it differs from classical logic in that it contemplates three values instead of two: truth, falsity, and indeterminacy.

Lukasiewicz rejects bivalence because he thinks that some sentences are indeterminate. A sentence is indeterminate when the way things are does not make it true and does not make it false. For example, (1) is indeterminate, because no fact or event today can make it true or false.

Lukasiewicz also rejects excluded middle. In his logic, the negation of an indeterminate sentence is itself indeterminate. For example, (2) is indeterminate, for its truth would amount to the falsity of (1), and its falsity would amount to the truth of (1). Moreover, a disjunction is indeterminate if both its disjuncts are indeterminate. So (3) is indeterminate. In general, every disjunction formed by an indeterminate sentence and its negation turns out indeterminate.
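The two clauses just described can be made concrete with a small Python sketch. It assumes the usual numeric encoding of the three values (1 for truth, 1/2 for indeterminacy, 0 for falsity), with negation as `1 - a` and disjunction as the maximum; the function and variable names are illustrative only.

```python
from fractions import Fraction

# Numeric encoding of the three values (an assumption of this sketch).
TRUE, INDET, FALSE = Fraction(1), Fraction(1, 2), Fraction(0)
NAMES = {TRUE: "true", INDET: "indeterminate", FALSE: "false"}

def neg(a):
    """Negation swaps truth and falsity and leaves indeterminacy fixed."""
    return 1 - a

def disj(a, b):
    """A disjunction takes the greater of the values of its disjuncts."""
    return max(a, b)

sea_battle = INDET                       # sentence (1)
no_sea_battle = neg(sea_battle)          # sentence (2)
either = disj(sea_battle, no_sea_battle) # sentence (3)

print(NAMES[no_sea_battle])  # indeterminate
print(NAMES[either])         # indeterminate
```

Because both connectives are functions of the input values alone, any disjunction of two indeterminate sentences receives the same value, whether or not the second is the negation of the first.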

The rejection of bivalence is an essential feature of any three-valued logic, for what defines such a logic is just the hypothesis that there are three values instead of two. The rejection of excluded middle, instead, is not essential in this sense. Assuming that there are three values, and that some connectives are truth-functional, there is no unique way to define those connectives. In particular, negation and disjunction could be so defined as to validate excluded middle.

However, it seems that there are no independent reasons for changing the definitions of negation and disjunction proposed by Lukasiewicz. First, it would make little sense to stipulate that the negation of an indeterminate sentence is true rather than indeterminate. Since (1) and (2) are about the same event, it is hard to see how (2) can be true if (1) is indeterminate. Second, it would make little sense to stipulate that a disjunction formed by two indeterminate sentences is true rather than indeterminate, because in that case, “Either there will be a sea battle tomorrow or it will rain tomorrow” would be true, which seems unreasonable.

On the other hand, from the perspective of a three-valued logic it would be impermissible to claim that some negations of indeterminate sentences are indeterminate while others are true, or that some disjunctions formed by indeterminate sentences are indeterminate while others are true. This would amount to giving up truth-functionality, which is essential to any such logic. To assume that “not” and “or” are truth-functional is to assume that the value of a negation or a disjunction (no matter whether truth, falsity, or indeterminacy) depends solely on the values of its constituents.

Thus, although Lukasiewicz’s logic is not the only three-valued logic that we can imagine, it is reasonable to think that no other three-valued logic can provide a better account of future contingents. Accordingly, we assume that three-valued logic invalidates both bivalence and excluded middle.

One merit of option 1 is that it accepts [EB]. This is plausible, given that [EB] is valid and that (A3) and (A4) express principles about truth and falsity that seem evident. According to [EB], if one accepts (E), one must also accept (B). So, by contraposition, if one rejects (B), one must also reject (E).

The rejection of excluded middle, however, constitutes a flaw of option 1, for it is hard to believe that a disjunction formed by a sentence and its negation, such as (3), is not true. Even though we do not know what will happen tomorrow, it seems certain that either there will be a sea battle tomorrow or there will not.

Another problem that affects option 1—the assertion problem—derives from the rejection of bivalence. As we have seen in section 1.a, the ordinary use of language suggests that some assertions about the future are correct, and hence that some future contingents are true. For example, “The sun will rise tomorrow” seems true. If all future contingents are indeterminate, however, this sentence cannot be true, so it is not clear why one should assert it. Those who adopt option 1 must explain how we can make apparently correct assertions by using future contingents.

b. Excluded Middle without Bivalence

The second option—option 2—is to deny bivalence but accept excluded middle. According to this option, bivalence entails fatalism, but excluded middle does not entail fatalism, because excluded middle does not entail bivalence. In other words, the argument that does not work is [EB], for one can accept (E) without accepting (B). This is the most plausible reading of Aristotle, advocated by Boethius, Peter Auriol, and many other scholars.

To justify option 2, one must explain why [EB] does not work. That is, one must explain why (A3) and (A4) are not true. Supervaluationism, a theory elaborated by Thomason (1984) on the basis of ideas expressed by Prior (1967) and Van Fraassen (1966), provides one coherent explanation. Supervaluationism rests on the assumption that future-tense sentences can be evaluated as true or false relative to possible futures. For example, in some possible futures there will be a sea battle tomorrow, while in others there will be peace. (1) is true in a future of the first kind, while it is false in one of the second kind. According to supervaluationism, to ask whether a future-tense sentence is true or false is to ask whether it is true in all possible futures or false in all possible futures. This idea can be phrased in a precise way if we define a “history” as a whole possible course of events, that is, a course of events that includes a possible future, and we assume that, for any future contingent “p,” uttered at a moment m, there is a set of accessible histories such that in each of them “p” is either true or false at m. Truth in the non-relative sense, truth simpliciter, is defined in terms of truth relative to histories: “p” is true at m if and only if it is true at m in all the histories of the set. Similarly, “p” is false at m if and only if it is false at m in all the histories of the set. The name of the theory comes from this idea. If we call “valuation” each attribution of value to a sentence relative to a history, we can call “supervaluation” an attribution of value to the sentence that takes into account all the valuations.

Supervaluationism draws a principled distinction between bivalence and excluded middle. Consider (1). Since (1) is true today in some histories and false today in other histories, (1) is neither true nor false today. The same goes for (2). In general, future contingents are neither true nor false, because they are true in some histories and false in others. Therefore, bivalence does not hold. Now consider (3). In every history, either the first disjunct is true today, or the second disjunct is true today. Consequently, (3) is true today. In general, a disjunction formed by a sentence and its negation is always true. Therefore, excluded middle holds.
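A minimal Python sketch can illustrate how the supervaluationist definitions separate bivalence from excluded middle. The two-history model and all names are hypothetical; each sentence is represented as a function from a history to a classical truth value.

```python
# Hypothetical model: two accessible histories, each settling whether
# there is a sea battle tomorrow.
HISTORIES = {
    "h1": {"sea_battle": True},
    "h2": {"sea_battle": False},
}

def s1(h):
    """(1) There will be a sea battle tomorrow."""
    return h["sea_battle"]

def s2(h):
    """(2) There will not be a sea battle tomorrow."""
    return not h["sea_battle"]

def s3(h):
    """(3) Either (1) or (2), evaluated classically within one history."""
    return s1(h) or s2(h)

def supervalue(sentence):
    """Truth simpliciter: true (false) iff true (false) in every history."""
    values = {sentence(h) for h in HISTORIES.values()}
    if values == {True}:
        return "true"
    if values == {False}:
        return "false"
    return "neither"

print(supervalue(s1))  # neither
print(supervalue(s2))  # neither
print(supervalue(s3))  # true
```

Within each history the connectives behave classically, yet the value that `supervalue` assigns to (3) is not determined by the values it assigns to (1) and (2).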

Note that this account of excluded middle involves an essential duality with respect to truth-functionality. There is a sense in which (3) is a truth function of its constituents, the sense in which, for any history h, (3) is true in h if and only if one of its disjuncts is true in h. There is also a sense in which (3) is not a truth function of its constituents, the sense in which (3) is true simpliciter even though neither of its disjuncts is true simpliciter. Truth-functionality holds at the level of truth relative to histories, but not at the level of truth simpliciter. This makes supervaluationism a partially non-classical theory.

Now let us go back to (A3) and (A4). Supervaluationism provides a motivation for rejecting (A3). Suppose that “p” is a future contingent that is true at m in h. Then the antecedent of (A3) is true at m in h. Its consequent, however, is not true at m in h, because in order to be true at m in h, “p” should be true at m in all histories. Therefore, (A3) is not true at m in h. It follows that (A3) is not true at m. A similar reasoning motivates the rejection of (A4). Suppose that “not-p” is a future contingent that is true at m in h. Then the antecedent of (A4) is true at m in h. Its consequent, however, is not true at m in h, because “p” is not false at m in all histories. So (A4) is not true at m in h. It follows that (A4) is not true at m.
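The reasoning about (A3) and (A4) can be traced in a short Python sketch. It assumes a hypothetical two-history model and reads the conditionals as material conditionals evaluated within a history, with the consequent "“p” is true" understood as truth in all histories; these modelling choices are assumptions of the sketch.

```python
# Value of "p" at m in each accessible history (hypothetical model).
HISTORIES = {"h1": True, "h2": False}

def p_true_simpliciter():
    return all(HISTORIES.values())

def p_false_simpliciter():
    return not any(HISTORIES.values())

def a3_in(h):
    """(A3) If p, then "p" is true, evaluated at m in history h."""
    return (not HISTORIES[h]) or p_true_simpliciter()

def a4_in(h):
    """(A4) If not-p, then "p" is false, evaluated at m in history h."""
    return HISTORIES[h] or p_false_simpliciter()

print(a3_in("h1"))  # False: p holds in h1, but "p" is not true in all histories
print(a4_in("h2"))  # False: not-p holds in h2, but "p" is not false in all histories

# Neither conditional is true in every history, so neither is true simpliciter.
print(all(a3_in(h) for h in HISTORIES))  # False
print(all(a4_in(h) for h in HISTORIES))  # False
```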

Although this explanation is consistent with the supervaluationist definition of truth, it is not entirely satisfactory, or so one might argue. The rejection of (A3) and (A4) speaks against supervaluationism, for (A3) and (A4) are very plausible assumptions. It seems trivial that “Snow is white” is true if snow is white, and that “Snow is white” is false if snow is not white. Precisely because this seems trivial, a theory of truth should make it come out true.

Independently of (A3) and (A4), the supervaluationist definition of truth may cause some perplexity. Some might contend that this definition mistakenly identifies truth with necessity. To say that “p” is true is not the same thing as to say that it is necessary that p, or so it appears. Imagine that Bob and Rob are at the racecourse and that Bob bets on Frisco. Bob and Rob are indeterminists, so they believe that it is possible that Frisco will win and that it is possible that Frisco will not win. In the middle of the race, Rob says to Bob: “Don’t worry, Frisco will win,” to which Bob replies, “I really hope that’s true.” Presumably, what Bob hopes is not that his philosophical convictions are false; that is, he does not hope that Frisco’s victory is necessary. To hope that Frisco will win is not the same thing as to hope that it is necessary that Frisco will win. It is consistent to hope that Frisco will win and think that it is possible that Frisco will not win. It thus seems that the truth of the sentence uttered by Rob does not amount to its truth in all histories.

The intuitive difference between the claim that “p” is true and the claim that it is necessary that p becomes even clearer when we consider retrospective attributions of truth. Suppose that Frisco really wins and that at the end of the race Bob exults: “You were right! It was true!” What Bob wants to say is that the sentence uttered by Rob during the race was true. However, the supervaluationist definition of truth entails that that sentence was neither true nor false, as it was false in some histories. This seems wrong, because the truth that Bob retrospectively attributes to the sentence uttered by Rob does not rule out its possible falsity. It is consistent to think that what Rob said was true and that, in the moment in which he said it, it was possible that Frisco would not win. Again, it seems that the truth of the sentence uttered by Rob does not amount to its truth in all histories.

Supervaluationism is not the only theory in line with option 2. Another theory, advocated by Belnap and others (Belnap, Perloff, and Xu 2001), implies that there is no such thing as truth simpliciter. Future contingents are true or false only relative to histories, because it is only relative to histories that they express a determinate content. Suppose that (1) is uttered today. Since at the moment of the utterance different futures are possible, each of which includes a different tomorrow, the word “tomorrow” in (1) does not denote a determinate moment, which means that (1) does not express a determinate content. Therefore, it makes no sense to ask whether (1) is true or false today. The only meaningful question that can be asked is whether (1) is true or false relative to a given history. This theory shares with supervaluationism the assumption that future contingents can be evaluated as true or false relative to possible futures, but does not identify truth simpliciter with truth in all histories, because it rejects the very idea of truth simpliciter.

MacFarlane (2003, 2008) has proposed a third theory. Just like Belnap and others, MacFarlane claims that there is no such thing as truth simpliciter. In this case, the motivation provided is that a parameter of evaluation other than the history has to be taken into account. According to MacFarlane, the value of a future contingent uttered at a given moment can vary depending on the context of assessment, that is, on the moment in which it is evaluated. Suppose that (1) is uttered today and that tomorrow there is a sea battle. Today, at the moment of the utterance, (1) is neither true nor false. Tomorrow, however, in the middle of the sea battle, (1) is true. Consequently, the same sentence, as uttered at a given moment, can have different values in different contexts of assessment.
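The dependence on the context of assessment can be illustrated with a small Python sketch. It assumes one simple way of modelling the idea: a sentence uttered at one moment and assessed at another is evaluated over the histories that pass through both moments. The model, the moment labels, and the function name are hypothetical and are not taken from MacFarlane’s own formulation.

```python
# Each history: (moments it passes through, whether a sea battle occurs tomorrow).
HISTORIES = {
    "h1": ({"today", "tomorrow_battle"}, True),
    "h2": ({"today", "tomorrow_peace"}, False),
}

def assess(utterance_moment, assessment_moment):
    """Value of (1), uttered at one moment and assessed at another."""
    relevant = [
        battle
        for moments, battle in HISTORIES.values()
        if utterance_moment in moments and assessment_moment in moments
    ]
    if not relevant:
        raise ValueError("no history passes through both moments")
    if all(relevant):
        return "true"
    if not any(relevant):
        return "false"
    return "neither"

print(assess("today", "today"))            # neither
print(assess("today", "tomorrow_battle"))  # true
```

The utterance is held fixed at "today" in both calls; only the moment of assessment changes, and the value changes with it.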

Both theories reject bivalence: Future contingents are not true or false, because they are not true or false in some absolute sense. Moreover, they both preserve excluded middle, because they make it valid in a relative sense. For example, (3) is always true today, in that it is true today in every history or in any context of assessment. These two theories thus have much in common with supervaluationism.

Leaving specific problems aside, both theories considered run into the assertion problem, as they reject bivalence. If one claims that “The sun will rise tomorrow” is neither true nor false, independently of the motivation adopted, one has to explain why it seems correct to assert this sentence.

To conclude, option 2 differs from option 1 in that it saves excluded middle, which is a merit. Its main flaws are essentially two. One is that it must provide a plausible definition of truth that—among other things—enables us to explain what is wrong with [EB]. The other is that it must address the assertion problem, which it shares with option 1.

c. Both Bivalence and Excluded Middle

The third option—option 3—is to accept both bivalence and excluded middle. According to this option, excluded middle entails bivalence, but bivalence does not entail fatalism. In other terms, the argument that does not work is [BF], for one can accept (B) without accepting (F).

To justify option 3, one must explain why [BF] does not work, that is, why (A1) and (A2) are not true. One way to do so is to endorse Ockham’s idea that one of the possible futures is the actual future, that is, the way things will actually go. In his Tractatus de praedestinatione et de praescientia Dei respectu futurorum contingentium, which aims to explain how divine foreknowledge is compatible with the contingency of events, Ockham draws a distinction between truth and determinate truth. The former is understood as truth in the actual future, the latter as truth in all possible futures. According to Ockham, future contingents are true or false, even though they are not determinately true or determinately false (1978).

The distinction between truth and determinate truth—which has been defended by Von Wright (1984), Lewis (1986) and Horwich (1987), among others—can be illustrated by means of the two examples considered in section 2.b. Suppose, as before, that Rob says to Bob, “Don’t worry, Frisco will win!” and that Bob replies, “I really hope that’s true.” As we have seen, it seems that Bob’s hope is not that Frisco’s victory is necessary. One obvious candidate for what he does hope for is the following: What Bob hopes is that Frisco will actually win, namely, that the possible future that will become reality is a future in which Frisco wins. Now, suppose that Frisco really wins and that Bob says to Rob: “You were right! It was true!” As we have seen, it seems correct to say that the sentence uttered by Rob was true, even though it was possible that Frisco would not win. If the truth of that sentence does not amount to its truth in all possible futures, it is unclear what it amounts to. Again, one obvious answer is that it amounts to the fact that Frisco actually won. Thus, a sentence can be true without being determinately true, if it is true in the actual future but false in some other future.

The theory that we will call Ockhamism is inspired by Ockham in that it defines truth in terms of the actual future. Ockhamism, just like the theories considered in section 2.b, adopts a relative notion of truth: A future contingent “p,” uttered at a moment m, can be evaluated as true or false in a set of accessible histories. Truth in the non-relative sense, truth simpliciter, is defined in terms of this notion: “p” is true at m if and only if “p” is true at m in the actual history. Similarly, “p” is false at m if and only if “p” is false at m in the actual history (Øhrstrøm 2009; Rosenkranz 2012; Iacona 2013, 2014; Wawer 2014; Malpass and Wawer 2018).

If truth is defined in terms of the actual history, then truth does not entail determinate truth. This is why Ockhamism rejects (A1) and (A2). Suppose that “p” is true at m in the actual history. In this case, the antecedent of (A1) is true at m, while its consequent is false at m. Similarly, suppose that “p” is false at m in the actual history. In this case, the antecedent of (A2) is true at m, while its consequent is false at m.
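The Ockhamist rejection of (A1) and (A2) can be illustrated with a minimal Python sketch. It assumes a hypothetical two-history model with one history designated as actual, and reads (A1) and (A2) as material conditionals; necessity and impossibility are understood as truth and falsity in all accessible histories.

```python
def evaluate(values, actual):
    """Evaluate (B), (A1), (A2), and (F) for "p" at m.

    values maps each accessible history to the value of "p" in it;
    actual names the history designated as the actual one.
    """
    is_true = values[actual]            # truth simpliciter
    is_false = not values[actual]       # falsity simpliciter
    necessary = all(values.values())    # determinate truth
    impossible = not any(values.values())
    return {
        "B": is_true or is_false,
        "A1": (not is_true) or necessary,
        "A2": (not is_false) or impossible,
        "F": necessary or impossible,
    }

model = {"h1": True, "h2": False}

print(evaluate(model, actual="h1"))
# {'B': True, 'A1': False, 'A2': True, 'F': False}
print(evaluate(model, actual="h2"))
# {'B': True, 'A1': True, 'A2': False, 'F': False}
```

Whichever history is actual, (B) holds and (F) fails, because one of the two conditionals linking truth to necessity or impossibility fails.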

This prompts the question of whether it makes sense to say that one of the possible futures is the actual future. The very idea of a unique actual future may easily raise doubts and misgivings. If one among the many possible futures is the actual future, it is unclear how the other futures can be equally possible, given that they will not become real. In other words, it seems impossible that what will happen is not predetermined. In order to adequately justify the distinction between truth and determinate truth, some convincing responses to these questions must be provided.

In sum, option 3 rescues bivalence and excluded middle, in accordance with classical logic. Moreover, it does not run into the assertion problem, because it implies that some future contingents are true, so it can explain the apparent correctness of some assertions about the future. The most problematic aspect of this option is the very idea of the actual future.

d. Further Considerations

The three logical options considered so far define the main positions within the debate on future contingents. Since these options do not exhaust the logical space of possibilities, this section dwells briefly on the only combination this article has not considered, namely, bivalence without excluded middle.

One way to give substance to this option, which comes from Peirce as interpreted by Prior, is the following: Future contingents are all false, because they describe future events as inevitable. For example, (1) and (2) are both false, because (1) says that there will necessarily be a sea battle tomorrow, while (2) says that there cannot be a sea battle tomorrow. Therefore, excluded middle does not hold: (3) is false, for both its disjuncts are false. Yet bivalence holds, because every sentence, including future contingents, is either true or false (Øhrstrøm and Hasle 1995; Prior 1967; Todd 2016).

The same problems that affect option 1 affect this position. First, the rejection of excluded middle is difficult to accept. (3) seems true, not false. Second, the assertion problem is still there. If all future contingents are false, then “The sun will rise tomorrow” cannot be true, in spite of the fact that it seems correct to assert it.

Independently of these two problems, the idea that all future contingents are false gives rise to further troubles. Consider (1) and (2). On the assumption that (2) is the negation of (1), as its syntactic structure suggests, it is unreasonable to think that (1) and (2) are both false. So, the most plausible way to claim that (1) and (2) are both false is to say that (2)—contrary to what its syntactic structure suggests—is not the negation of (1). The negation of (1) would rather be “It is not the case that there will be a sea battle tomorrow.” On the hypothesis that (2) and “It is not the case that there will be a sea battle tomorrow” express different contents, it is consistent to say that the former is false while the latter is true. Note, however, that this way, “Either there will be a sea battle tomorrow or it is not the case that there will be a sea battle tomorrow” turns out true. Thus, there is a clear sense in which excluded middle holds: If “It is not the case that there will be a sea battle tomorrow” is the negation of (1), the sentence that instantiates (E) is “Either there will be a sea battle tomorrow or it is not the case that there will be a sea battle tomorrow,” not (3). Moreover, we still need an explanation of why (2) and “It is not the case that there will be a sea battle tomorrow” express different contents, given that they seem to say exactly the same thing.
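The position just discussed can be sketched in Python under the assumption that "there will be" is read as "in every possible future there is". The two-history model and the variable names are hypothetical. The sketch shows why (3) comes out false while the disjunction built with the external negation comes out true.

```python
# Whether a sea battle occurs tomorrow in each possible future (hypothetical).
HISTORIES = {"h1": True, "h2": False}

# (1) "There will be a sea battle tomorrow": battle in every future.
will_battle = all(HISTORIES.values())

# (2) "There will not be a sea battle tomorrow": no battle in every future.
will_no_battle = all(not v for v in HISTORIES.values())

# (3) Disjunction of (1) and (2).
s3 = will_battle or will_no_battle

# External negation: "It is not the case that there will be a sea battle tomorrow".
not_will_battle = not will_battle

print(will_battle, will_no_battle)     # False False
print(s3)                              # False
print(will_battle or not_will_battle)  # True
```

Every sentence in the sketch receives one of the two classical values, so bivalence is respected; what fails is only the disjunction of (1) with (2), which on this reading is not the negation of (1).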

These troubles explain the scarce popularity of the option just considered. The debate on future contingents almost never sees the acceptance of bivalence combined with the rejection of excluded middle, because most thinkers take it for granted that bivalence is at least as controversial as excluded middle.

3. Three Metaphysical Views

a. Past, Present, and Future Entities

So far, we have considered three logical options that differ with respect to bivalence and excluded middle. Now we will address the key metaphysical issue that underlies the problem of future contingents: what there is in front of us.

Let us first introduce four basic ontological conceptions of time, that is, four conceptions of the existence of past, present, and future entities. Past entities and future entities resemble present entities in some respects but not in others. On the one hand, there is a sense in which Caesar is like us and unlike the Abominable Snowman: Caesar was a real person, while the Abominable Snowman has never existed. The same goes for future children, who will be real persons just like us. On the other hand, there is a sense in which Caesar is not like us: We are here, while he is no longer here. Similarly, future children are not here yet. The four conceptions considered in this article weigh these similarities and differences in different ways.

Presentism is the conception according to which only present entities exist. We exist, but Caesar and future children do not exist. Existing and being present are the same thing. Imagine an incredibly big and incredibly thin slice of salami. The slice is the present, and we are in it. Behind us there is nothing, because the past does not exist, and ahead of us there is nothing, because the future does not exist. This conception, which is defended by Prior (1970), Bigelow (1996), and Bourne (2006), among others, is represented in figure 1.

Figure 1: Presentism

The growing block theory, alternatively, is the conception according to which past and present entities exist, but future entities do not exist. Caesar exists, we exist, but future children do not exist. This conception, defended by Broad (1923), Tooley (1997), and Correia and Rosenkranz (2018), among others, describes reality as a totality that constantly increases as time passes. In figure 2, the slice of salami that represents the present is attached to the portion of salami that precedes it, the past.

Figure 2: Growing block

A third conception that is purportedly opposite to the growing block theory is the shrinking block theory. According to this theory, which is not widely accepted (though see, for example, Casati and Torrengo 2011), present and future entities exist, but past entities do not exist. We exist, future children exist, but Caesar does not exist. Reality is what is left, so to say, and the future is constantly eroded as time passes. In figure 3, the slice of salami that represents the present is attached to the portion of salami that follows it, the future.

Figure 3: Shrinking block

Finally, eternalism is the view according to which past, present, and future entities exist. We exist, and the same goes for Caesar and future children. This conception is defended by Williams (1951), Taylor (1955), Smart (1963), Putnam (1967), Mellor (1998), and Sider (2001), among others. In figure 4, the slice of salami that represents the present is part of a whole salami, a history, which may be conceived of as a sequence of moments.

Figure 4: Eternalism

While the first three conceptions are essentially dynamic, in that they imply that the passage of time is metaphysically real, eternalism may be understood either dynamically, assuming that the present really moves along the line of time, or statically, assuming that the experience of the passage of time is merely illusory. On both interpretations, the idea that underlies eternalism is that temporal relations are somehow similar to spatial relations. For example, Turin, Milan, and Venice are located on three points ordered along the west-east axis. Although each of these three cities offers a distinct perspective on the other two, the spatial relations among them (the order in which they are located along the west-east axis) do not vary with the point of observation. According to eternalism, the same goes for temporal relations. Being present is like being in Milan. There is no ontological difference between Caesar, us, and future children, just as there is no ontological difference between Turin, Milan, and Venice (see the article on time).

The classification just presented will help with understanding the three metaphysical views considered in the next three sections. As these sections show, these three views can be associated with options 1-3, although there is no necessary connection between them. Each view provides a distinct answer to the question of what there is ahead of us.

b. No Future

The first view, the no-future view, says that there is absolutely nothing ahead of us: The future does not exist. Certainly, many things will happen, and it makes perfect sense to talk about such things. However, what will happen will exist only when it happens; it does not exist now. And when it happens, it will no longer be future.

Presentism and the growing block theory entail the no-future view. Although these two conceptions differ with respect to the question of whether the past exists, they agree on the non-existence of the future. By contrast, the shrinking block theory and eternalism contradict the no-future view. Although these two conceptions differ with respect to the question of whether the past exists, they agree on the existence of the future. Therefore, the no-future view can be maintained either in a presentist perspective or in a growing-block perspective.

Of the three logical options considered in section 2, the one that best suits the no-future view is option 1. If the future does not exist, there is nothing that can make future-tense sentences true or false. For example, there is nothing that can make (1) and (2) true or false. It is thus sensible to claim that future-tense sentences violate bivalence. This is probably what Lukasiewicz had in mind, although he did not explicitly address the distinction between presentism and growing block theory.

Perhaps it is also sensible to claim that future-tense sentences violate excluded middle. If nothing can make (1) or (2) true, the same goes for (3). The “perhaps” is due to the fact that the inference from the absence of truth of (1) and (2) to the absence of truth of (3) requires a further constraint that plays a crucial role in three-valued logic, namely, truth-functionality. Assuming that a disjunction is true only if one of its disjuncts is true, from the absence of truth of (1) and (2) we can infer the absence of truth of (3). Without that assumption, the inference is not legitimate. As we have seen in section 2.b, supervaluationism differs from three-valued logic precisely in that it gives up truth-functionality to save excluded middle.

The no-future view—especially in the growing block version—provides a metaphysical substratum for the idea that future-tense sentences are sui generis from the logical point of view. The difference at the logical level can be explained by a difference at the metaphysical level: The past and the present exist, whereas the future does not exist. This is not to say that, strictly speaking, the no-future view entails that idea. For example, Correia and Rosenkranz (2018) argue that the growing block theory is consistent with bivalence.

c. Many Futures

The second and the third view differ from the first in that they entail the existence of future entities. Although this makes them compatible both with the shrinking block theory and with eternalism, they are usually framed in an eternalist perspective. In such a perspective, the contingency of a future event cannot be conceived of in terms of absence, as in the no-future view, because an event cannot be future without existing. Rather, it will be conceived of in terms of presence in some but not in all possible futures. This is why the second and the third view contemplate a plurality of histories. A history is a possible world, that is, a totality of past, present, and future entities that is completely defined in its spatial and temporal properties.

The second view—the many-futures view—says that there are many futures ahead of us, that is, many possible continuations of the present. These continuations are like branches that depart from the same trunk, and they are metaphysically on a par, that is, they all exist and they are all actual (or none of them is). Figure 5 illustrates the many-futures view by recalling the salami analogy. The slice is the present, as in the previous figures, but there are two portions of salami on the right, that is, two possible continuations of the present. Each of these two portions, together with the left portion, forms a whole salami. Therefore, the slice belongs to two distinct salami.

Figure 5: Branching

The idea illustrated in figure 5 can be represented in a more abstract way by using simple lines. In figure 6, h1 and h2 are histories, while m0, m1 and m2 are moments. m0 belongs both to h1 and to h2. Instead, m1 belongs only to h1, and m2 belongs only to h2. While m0 precedes both m1 and m2, m1 and m2 are unrelated, in that neither of them precedes the other. Diagrams of this kind, introduced by Kripke and Prior, are often employed in temporal logic to represent the set of future possibilities (Prior 1967).

Figure 6: One past, one present, two futures

The case of the sea battle can be described in terms of this figure. Suppose that m0 is today, that is, the moment at which (1) and (2) are uttered. h1 and h2 are histories that lead to different tomorrows: m1 is a peaceful tomorrow, while m2 is a tomorrow in which there is a sea battle. h1 and h2 have a part in common, that is, our past until today. The two portions of h1 and h2 that follow m0 are distinct possible futures. The contingency of the sea battle consists precisely in the existence of these futures.

Note that figure 6 shows two distinct tomorrows instead of one. Each of these two tomorrows belongs only to one history. However, this does not mean that it makes no sense to describe m1 and m2 as simultaneous. On the contrary, assuming that there is an absolute temporal axis, that is, that time can be measured from a point of view that is external to the histories, we can say that m1 and m2 are located at the same point along that axis. If we call instant an absolute temporal unit, definable as a set of equivalent moments, we can say that two moments that belong to different histories are in the same instant. In figure 7, i0 is the present instant, that is, the instant that includes m0, and i1 is the instant that includes m1 and m2.

Figure 7: The sea battle

The many-futures view is clearly in line with option 2. In the framework just sketched, future contingents can be evaluated as true or false at moments relative to histories. For example, (1) is true at m0 in h2 but false at m0 in h1. Similarly, (2) is true at m0 in h1 but false at m0 in h2. According to the supervaluationist definition of truth, this entails that (1) and (2) are neither true nor false at m0, so that bivalence does not hold. Instead, excluded middle holds. (3) is true at m0, for it is true at m0 both in h1, given that (2) is true at m0 in h1, and in h2, given that (1) is true at m0 in h2. The two further theories considered in section 2.b fit the many-futures view equally well, in that they employ the same notion of truth relative to histories.
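The evaluation just described can be made concrete with a small sketch. It is only an illustration under stated assumptions: the dictionary encoding of figure 7, the function names, and the reading of (1) as "there will be a sea battle tomorrow", (2) as its negation, and (3) as their disjunction are choices made here, not part of any established formalism.

```python
# Toy encoding of figure 7 (all names are illustrative assumptions).
# Each history maps an instant to the moment it passes through.
HISTORIES = {
    "h1": {"i0": "m0", "i1": "m1"},  # peaceful tomorrow
    "h2": {"i0": "m0", "i1": "m2"},  # tomorrow with a sea battle
}
SEA_BATTLE = {"m2"}  # moments at which a sea battle takes place


def histories_through(moment):
    """Names of the histories that contain the given moment."""
    return [h for h, line in HISTORIES.items() if moment in line.values()]


def sentence_1(moment, history):
    """(1): a sea battle tomorrow, evaluated at a moment in i0 relative to a history."""
    return HISTORIES[history]["i1"] in SEA_BATTLE


def sentence_2(moment, history):
    """(2): no sea battle tomorrow."""
    return not sentence_1(moment, history)


def sentence_3(moment, history):
    """(3): the disjunction of (1) and (2)."""
    return sentence_1(moment, history) or sentence_2(moment, history)


def supervaluate(sentence, moment):
    """True or False if the sentence has that value in every history
    through the moment; None if the histories disagree."""
    values = {sentence(moment, h) for h in histories_through(moment)}
    if values == {True}:
        return True
    if values == {False}:
        return False
    return None


print(supervaluate(sentence_1, "m0"))  # None: neither true nor false
print(supervaluate(sentence_2, "m0"))  # None: neither true nor false
print(supervaluate(sentence_3, "m0"))  # True: excluded middle holds
```

The value None stands for the truth-value gap. The disjunction comes out true although neither disjunct does, which is exactly the failure of truth-functionality that supervaluationism accepts.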

d. One Future

The third view—the one-future view—says that there is one future ahead of us, our future. This view has two versions. According to one of them—the thin red line—many possible futures depart from our present, but these futures are not metaphysically on a par because only one of them is actual. According to the other—divergence—we have a single future because we belong to a single history, the actual history, although there are other histories that are exactly like our history up to the present but have a different future. The key difference between the two versions concerns the possibility of overlap. To endorse the thin red line is to think that two histories can overlap, that is, that they can have some part in common. To endorse divergence, instead, is to conceive histories as entirely disconnected totalities. Here we will focus on divergence, although what will be said applies, mutatis mutandis, to the thin red line.

Figure 8 illustrates divergence. Imagine that we are in the salami below, and that the left portion of the salami above—the portion that precedes the slice—is identical to the left portion of our salami, but that the right portion of the salami above—the portion that follows the slice—differs from the right portion of our salami. In this case the two salami are divergent histories.

Note that figure 8 shows two presents, each of which belongs to a single history. This is not to say that it makes no sense to describe such moments as simultaneous. As in the many-futures view, simultaneity can be defined in terms of instants. Figure 9 represents the two histories considered above as horizontal lines, h1 and h2, and represents the instant that the two presents have in common as a vertical line that intersects h1 and h2. Our present, m0, is in h1 and differs from m1, which is in h2. However, m0 and m1 are simultaneous in the sense that they belong to the same instant i0.

Figure 8: Divergence

Figure 9: Two pasts, two presents, two futures.

The question is who the individuals in the other history are, given that they are exactly like us up to now. Lewis, who defends divergence, calls such individuals counterparts. If we are in h1, then in h2 there are other individuals who are our counterparts. Just as we have a future, the right portion of h1, our counterparts have their own future, the right portion of h2 (Lewis 1986).

Now let us go back to the sea battle. Figure 10 represents two histories h1 and h2 that are exactly alike up to i0 but then differ. m0 and m1 are two distinct but qualitatively identical todays, each of which has its own tomorrow: m2 is a peaceful tomorrow, while m3 is a tomorrow in which there is a sea battle. Therefore, (1) is true at m1, while it is false at m0. Since m1 and m0 belong respectively to h2 and h1, this is to say that (1) is true in h2, while it is false in h1. Whether (1) is true or false simpliciter depends on which of the two histories is the actual history. If we are in m0 we will have peace, whereas if we are in m1 we will find ourselves in the middle of a sea battle.

Figure 10: The sea battle

It is important to note that being in a given history does not mean being in a position to discern that history from other histories. Suppose that we are in h1. Since m0 is qualitatively identical to m1, and the same goes for any moment that precedes m0, for us h1 is indistinguishable from h2. So at i0 we are not in a position to know whether we are in h1 or in h2. Consequently, we are not in a position to know whether our future includes m2 or m3. In a way, we do not know what will happen tomorrow because we do not know where we are.

The one-future view suits option 3. The framework just sketched preserves bivalence. Suppose, as before, that (1) is true at m1 and false at m0. Then, no matter which of the two histories is the actual history, (1) is either true or false. This is not to say that (1) is determinately true or determinately false. Assuming that determinate truth at a moment amounts to truth at all moments in the same instant, and that determinate falsity at a moment amounts to falsity at all moments in the same instant, (1) is neither determinately true at m1 nor determinately false at m0. Excluded middle is preserved as well. (3) is true both at m0 and at m1. Therefore, it is determinately true.
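The contrast between plain truth and determinate truth under divergence can be sketched in the same spirit. The encoding of figure 10 below, including every table and function name, is an assumption introduced for illustration; (1), (2), and (3) are again read as "there will be a sea battle tomorrow", its negation, and their disjunction.

```python
# Toy encoding of figure 10 (all names are illustrative assumptions).
# Under divergence, histories share no moments.
INSTANT_OF = {"m0": "i0", "m1": "i0", "m2": "i1", "m3": "i1"}
TOMORROW_OF = {"m0": "m2", "m1": "m3"}  # m0, m2 lie in h1; m1, m3 lie in h2
SEA_BATTLE = {"m3"}                     # only h2 contains a sea battle


def sentence_1(moment):
    """(1): a sea battle tomorrow. Each moment has exactly one tomorrow."""
    return TOMORROW_OF[moment] in SEA_BATTLE


def sentence_2(moment):
    """(2): no sea battle tomorrow."""
    return not sentence_1(moment)


def sentence_3(moment):
    """(3): the disjunction of (1) and (2)."""
    return sentence_1(moment) or sentence_2(moment)


def same_instant(moment):
    """All moments located in the same instant as the given one."""
    return [m for m, i in INSTANT_OF.items() if i == INSTANT_OF[moment]]


def determinately_true(sentence, moment):
    """Truth at all moments in the same instant."""
    return all(sentence(m) for m in same_instant(moment))


def determinately_false(sentence, moment):
    """Falsity at all moments in the same instant."""
    return all(not sentence(m) for m in same_instant(moment))


print(sentence_1("m1"), sentence_1("m0"))    # True False: bivalence holds
print(determinately_true(sentence_1, "m1"))  # False
print(determinately_false(sentence_1, "m0")) # False
print(determinately_true(sentence_3, "m0"))  # True
```

Unlike the supervaluationist evaluation, every sentence here receives a classical value at each moment; what is missing for (1) is only the stronger status of determinate truth or determinate falsity.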

4. The Open Future

a. Alternative Possibilities

Most discussions on future contingents take for granted that fatalism is wrong. Despite this, it is not obvious what the right view is. The thought that underlies the rejection of fatalism is often expressed by saying that the future is open. The contemporary literature on future contingents widely employs the metaphor of openness to characterize the view that the future is unsettled. Yet it is possible to understand openness in more than one way. This last section provides some clarifications about the claim that the future is open.

A simple and straightforward way to interpret the claim that the future is open is to define openness in terms of the existence of alternative possibilities: To say that the future is open is to say that, for some “p,” it is possible that p and it is possible that not-p. This interpretation is simple and straightforward because it equates the claim that the future is open with the pure negation of fatalism. As it turns out from section 1.c, fatalism is the claim that, for every “p,” either it is necessary that p or it is impossible that p. Consequently, its negation is the claim that, for some “p,” it is neither necessary nor impossible that p, that is, it is possible that p and it is possible that not-p.
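The step from fatalism to its negation can be written out in modal notation. The symbols are an assumption introduced here for illustration: the box stands for "it is necessary that", the diamond for "it is possible that", and the only principle used is the standard duality between the two.

```latex
\begin{align*}
\text{Fatalism:} \quad & \forall p\,(\Box p \lor \neg\Diamond p) \\
\text{Negation of fatalism:} \quad & \exists p\,\neg(\Box p \lor \neg\Diamond p) \\
\iff \quad & \exists p\,(\neg\Box p \land \Diamond p) \\
\iff \quad & \exists p\,(\Diamond\neg p \land \Diamond p)
\end{align*}
```

The last line uses the equivalence of $\neg\Box p$ with $\Diamond\neg p$, and it is the claim that, for some p, it is possible that p and it is possible that not-p.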

If the openness of the future is understood in terms of the existence of alternative possibilities, then it is consistent with the three metaphysical views outlined in section 3. If one endorses the no-future view, one can say that, although there is presently nothing ahead of us, it is possible that what will exist is such that p and it is possible that what will exist is such that not-p. If one endorses the many-futures view, one can say that there are possible futures in which p and possible futures in which not-p. The same goes for the one-future view, even though in the case of divergence the possible futures have distinct pasts and distinct presents.

b. Indetermination

Another way to interpret the claim that the future is open is to define openness in terms of indetermination, understood as absence of determination: To say that the future is open is to say that nothing determines the future. This can mean two things: either that the future is not determined by some divine entity, or that the future is not determined by the laws of nature. Here we focus on the second reading, which became widespread by the early 21st century, although these considerations apply to the first as well.

The idea that every event is determined by the laws of nature goes back to antiquity and has been widely discussed in modern and contemporary philosophy. According to this idea, every event follows as an effect from some cause in accordance with the laws of nature. Determination may be defined as a relation between states, understood as global conditions in which the universe can be at an instant. Given a state S that obtains at i0 and given a state S′ that obtains at i1, S determines S′ if and only if the obtaining of S at i0, together with the laws of nature, entails that S′ obtains at i1. Determinism is the view that, for every instant, the state that obtains at that instant is determined by the states that obtained at previous instants (Hoefer, 2003).

None of the three metaphysical views outlined in section 3 entails determinism. Suppose that i0 is the present instant and that S is the state of the universe at i0. According to the no-future view, given an instant i1 later than i0, nothing exists in i1, even though when we are in i1, another state S′ will obtain. The no-future view says nothing about the relation between S and S′, so it is consistent with the hypothesis that S does not determine S′. Now consider the many-futures view. Suppose, as in figure 7, that m0 is the present moment and that m1 and m2 are future moments that belong to i1. If S is the state that obtains at m0, while S′ and S″ are the states that obtain respectively at m1 and m2, then S determines neither S′ nor S″, for it is compatible both with S′ and with S″. Finally, consider the one-future view. Suppose, as in figure 10, that m0 and m1 are in i0, and that m2 and m3 are in i1. If S is the state that obtains at m0 and m1, in that h1 and h2 are identical up to i0, while S′ and S″ are the states that obtain respectively at m2 and m3, then S determines neither S′ nor S″, for it is compatible both with S′ and with S″.
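The definition of determination lends itself to a rough computational reading. In the sketch below, which rests on assumptions not found in the article, the laws of nature are represented only indirectly by the set of histories they allow, each history is a mapping from instants to state labels, and entailment is approximated as "holds in every allowed history". The labels `S_a` and `S_b` stand for the two states that can obtain at i1.

```python
def determines(lawful_histories, state, instant, later_state, later_instant):
    """Hypothetical check: `state` at `instant` determines `later_state` at
    `later_instant` iff every lawful history in which `state` obtains at
    `instant` has `later_state` at `later_instant`.
    Vacuously True when no lawful history contains `state` at `instant`."""
    relevant = [h for h in lawful_histories if h[instant] == state]
    return all(h[later_instant] == later_state for h in relevant)


# Two lawful histories that agree at i0 and differ at i1,
# as in the branching and divergence cases.
indeterministic = [
    {"i0": "S", "i1": "S_a"},
    {"i0": "S", "i1": "S_b"},
]

# A single lawful continuation of S.
deterministic = [
    {"i0": "S", "i1": "S_a"},
]

print(determines(indeterministic, "S", "i0", "S_a", "i1"))  # False
print(determines(indeterministic, "S", "i0", "S_b", "i1"))  # False
print(determines(deterministic, "S", "i0", "S_a", "i1"))    # True
```

On this reading, S determines neither later state in the first model because it is compatible with both, which mirrors the reasoning given for the many-futures and one-future views.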

It is important to note that indetermination is not the same thing as indeterminateness, understood as absence of determinateness. If determinateness is the property that a possible future has when it is completely defined in its spatial and temporal properties, then indetermination does not entail indeterminateness. It is consistent to claim, as in the case of branching or divergence, that indetermination holds because there are many possible futures, each of which is completely defined in its spatial and temporal properties. Indetermination and indeterminateness are independent properties.

c. Causal Power

A third way to interpret the claim that the future is open is to define openness in terms of causal power: To say that the future is open is to say that we can affect the future, in that our present actions have future effects. For example, if tonight we set the alarm on our phone to 7 a.m., the sound that the phone will emit tomorrow at 7 a.m. is an effect of the movements that we perform tonight.

The idea that our present actions have future effects is obviously consistent with the three metaphysical views outlined in section 3. In each of the three cases, it makes perfect sense to say that an event which occurs at a given time causes another event that occurs at a later time.

Note that the past does not depend on us in the same sense, because our present actions do not have past effects. This asymmetry can be described in terms of counterfactual dependence, as Lewis has suggested. The future counterfactually depends on the present, because it would be different if the present were different. Suppose that tonight we set the alarm on our phone to 7 a.m. It is correct to say that, if the alarm were not set, the phone would not emit any sound tomorrow at 7 a.m. Instead, the past does not counterfactually depend on the present, because it would not be different if the present were different. If the alarm were not set, what happened yesterday would remain exactly the same (Lewis 1979).

The claim that we can affect the future must not be confused with the claim that we can change the future, that is, that we can replace the future with another future. It is one thing to say that a future event, such as the sound that the phone will emit tomorrow at 7 a.m., is caused by a present event; it is quite another thing to say that a future event can be replaced by a different future event. The claim that we can change the future is hardly intelligible, or so it appears to most philosophers (an exception is Todd 2016). In any case, this claim seems incompatible with the three metaphysical views outlined in section 3. If the no-future view is true, then the future does not exist, so nothing can be changed. If the many-futures view is true, then there are many possible futures, so it makes no sense to say that we can change “the” future. And in any case, each of the possible futures is essentially identical to itself. Finally, if the one-future view is true, then there is a unique future, which cannot be changed.

d. Other Definitions

As it turns out from sections 4.a-4.c, there are three plausible interpretations of the claim that the future is open: The first is that, for some “p,” it is possible that p and it is possible that not-p; the second is that the future is not determined; and the third is that we can affect the future. Each of these interpretations is consistent with the three metaphysical views outlined in section 3: No matter whether one endorses the no-future view, the many-futures view, or the one-future view, one can coherently claim that the future is open. Since options 1-3 accord, respectively, with the no-future view, the many-futures view, and the one-future view, this suggests that the claim that the future is open, on the three interpretations considered, is compatible with any solution to the problem of future contingents.

Of course, the three interpretations considered are not the only admissible interpretations. Other interpretations are possible. Nothing prevents us from defining openness in terms of some specific logical option or metaphysical view. The question then arises of whether the future is really open in the sense defined. Merely stipulating that openness amounts to this or that condition does not provide any reason to think that the stipulation captures some pre-theoretical intuition.

Some philosophers have suggested that the openness of the future amounts to the failure of bivalence for future-tense sentences (as in Markosian 1995). On this interpretation, the claim that the future is open yields substantive consequences, for it licenses options 1 and 2 while it rules out option 3. However, as some have observed (Barnes and Cameron 2009; Besson and Hattiangadi 2014), it is controversial whether the future is open in this sense. Aristotle needed an argument to show that bivalence does not hold for future contingents.

Other philosophers have suggested that the openness of the future amounts to the many-futures view: To say that the future is open is to say that there are multiple branching futures which are metaphysically on a par (as in MacFarlane 2003). On this interpretation, again, the claim that the future is open yields substantive consequences, for it rules out both the no-future view and the one-future view. However, it is controversial whether the future is open in this sense.

The controversy emerges clearly in the dialectic between branching and divergence. According to the advocates of the many-futures view, divergence does not preserve openness. Suppose that Betty wonders whether she can become an internationally acclaimed photographer. As far as divergence is concerned, the answer is affirmative even if Betty will become a door-to-door cosmetics seller, provided that there is a history in which another individual very similar to Betty (call her Betty*) will become an internationally acclaimed photographer. The fact, however, is that what Betty wonders, and what concerns her, is whether she, Betty, can become an internationally acclaimed photographer, not whether another person has that opportunity. It does not seem that Betty’s future is open if it only includes the sale of cosmetics. The openness of the future seems to imply not only that the alternative possibilities exist, but that they exist for the same individuals.

To this objection it might be replied that divergence does not deny that one and the same individual has alternative possibilities. Let us assume that “Betty can become an internationally acclaimed photographer” is true. Insofar as divergence explains the truth of this sentence in terms of the existence of a history in which Betty* becomes an internationally acclaimed photographer, the individual to whom it is correct to attribute the modal property of possibly becoming an internationally acclaimed photographer is Betty, not Betty*. Certainly, this explanation cannot be understood as a description of what Betty has in mind when she wonders whether she can become an internationally acclaimed photographer. However, the same holds for any other explanation of the same fact. Just as Betty does not think about Betty*, she does not think that she inhabits two histories that share a common segment and branch towards the future.

It is difficult to judge who is right. The objection against divergence stems from a line of thought that goes back to Kripke and that is antithetical to the theory of counterparts defended by Lewis. According to this line of thought, the truth or falsity of a sentence that attributes a modal property to an individual depends on what happens to the same individual in possible worlds other than the actual world. For example, Kripke claims that the sentence, “It might have been the case that Aristotle was not a philosopher,” is true because there are possible worlds in which Aristotle, the same Aristotle, was not a philosopher. The question of which of these two positions is preferable concerns possible worlds in general, and cannot be settled simply by appealing to intuitions.

5. References and Further Reading

  • Barnes, E. and Cameron, R. 2009. The Open Future: Bivalence, Determinism and Ontology. Philosophical Studies, 146:291–309.
  • Besson, C. and Hattiangadi, A. 2014. The Open Future, Bivalence and Assertion. Philosophical Studies, 162:251–271.
  • Bigelow, J. 1996. Presentism and Properties. Philosophical Perspectives, 10:35–52.
  • Bourne, C. 2006. A Future for Presentism. Oxford: Oxford University Press.
  • Broad, C. D. 1923. Scientific Thought. London: Routledge.
  • Casati, R. and Torrengo, G. 2011. The Not So Incredible Shrinking Future. Analysis, 71:240–244.
  • Correia, F. and Rosenkranz, S. 2018. Nothing To Come: A Defence of the Growing Block Theory of Time. Cham, Switzerland: Springer.
  • Dowden, B. 2018. Time. Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/time/.
  • Hoefer, C. 2003. Causal Determinism. Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/determinism-causal/
  • Horwich, P. 1987. Asymmetries in Time. Cambridge (MA): MIT Press.
  • Iacona, A. 2013. Timeless Truth. In Around the Tree: Semantic and Metaphysical Issues Concerning Branching and the Open Future, edited by F. Correia and A. Iacona, 29–45. Cham, Switzerland: Springer.
  • Iacona, A. 2014. Ockhamism without Thin Red Lines. Synthese, 191:2633–2652.
  • Lewis, D. 1979. Counterfactual Dependence and Time’s Arrow. Noûs, 13:455–476.
  • Lewis, D. 1986. On the Plurality of Worlds. Oxford: Blackwell.
  • Lukasiewicz, J. 1970. On Three-Valued Logic. In Selected Works, edited by L. Borkowski, 87–88. Amsterdam: North-Holland.
  • MacFarlane, J. 2003. Future Contingents and Relative Truth. Philosophical Quarterly, 53:321–336.
  • MacFarlane, J. 2008. Relative Truth. In Truth in the Garden of Forking Paths, edited by M. Garcia-Carpintero and M. Kölbel, 81–102. Oxford: Oxford University Press.
  • Malpass, A. and Wawer, J. 2018. Back to the Actual Future. Synthese.
  • Markosian, N. 1995. The Open Past. Philosophical Studies, 79:95–105.
  • Mellor, H. 1998. Real Time II. London: Routledge.
  • Ockham, W. 1978. Tractatus de praedestinatione et de praescientia dei respectu futurorum contingentium. In Opera philosophica et theologica, volume II. St. Bonaventure, New York: The Franciscan Institute.
  • Øhrstrøm, P. 2009. In Defence of the Thin Red Line: A Case for Ockhamism. Humana Mente, 8:17–32.
  • Øhrstrøm, P. and Hasle, P. F. V. 1995. Temporal Logic. From Ancient Ideas to Artificial Intelligence. Dordrecht: Kluwer.
  • Perloff, M., Belnap, N., and Xu, M. 2001. Facing the Future. Oxford: Oxford University Press.
  • Prior, A. N. 1967. Past, Present and Future. Oxford: Clarendon Press.
  • Prior, A. N. 1970. The Notion of the Present. Studium Generale, 23:245–248.
  • Putnam, H. 1967. Time and Physical Geometry. Journal of Philosophy, 64:240–247.
  • Rosenkranz, S. 2012. In Defence of Ockhamism. Philosophia, 40:617–31.
  • Sider, T. 2001. Four Dimensionalism. Oxford: Oxford University Press.
  • Smart, J. J. C. 1963. Philosophy and Scientific Realism. Humanities Press.
  • Taylor, R. 1955. Spatial and Temporal Analogies and the Concept of Identity. Journal of Philosophy, 52:599–612.
  • Thomason, R. H. 1984. Combinations of Tense and Modality. In Handbook of Philosophical Logic, volume 2, edited by D. Gabbay and G. Guenthner, 135–165. Dordrecht: Reidel.
  • Todd, P. 2016. On Behalf of a Mutable Future. Synthese, 193:2077–2095.
  • Tooley, M. 1997. Time, Tense, and Causation. Oxford: Oxford University Press.
  • van Fraassen, B. 1966. Singular Terms, Truth-Value Gaps, and Free Logic. Journal of Philosophy, 63:481–495.
  • von Wright, G. H. 1984. Determinism and Future Truth. In Truth, Knowledge, and Modality, 1–13. Oxford: Blackwell.
  • Wawer, J. 2014. The Truth about the Future. Erkenntnis, 79:365–401.
  • Williams, D. C. 1951. The Myth of the Passage. Journal of Philosophy, 48:457–472.

Author Information

Andrea Iacona
Email: andrea.iacona@unito.it
University of Turin
Italy

Metaphysics of Science

Metaphysics of Science is the philosophical study of key concepts that figure prominently in science and that, prima facie, stand in need of clarification. It is also concerned with the phenomena that correspond to these concepts. Exemplary topics within Metaphysics of Science include laws of nature, causation, dispositions, natural kinds, possibility and necessity, explanation, reduction, emergence, grounding, and space and time.

Metaphysics of Science is a subfield of both metaphysics and the philosophy of science—that is, it can be allocated to either, but it exhausts neither. Unlike metaphysics simpliciter, Metaphysics of Science is not primarily concerned with metaphysical questions that may already arise from everyday phenomena such as what makes a thing (a chair, a desk) the very thing it is, what its identity criteria are, out of which parts is it composed, whether it remains the same if we exchange a couple of its parts, and so forth. Nor is it concerned with the concrete entities (superstrings, molecules, genes, and so forth) postulated by specific sciences; these issues are the subject matter of the special philosophies of science (for example, of physics, of chemistry, of biology).

Metaphysics of Science is concerned with more abstract and general concepts that inform all of these sciences. Many of these concepts are interwoven with each other. For example, metaphysicians of science inquire whether dispositionality, lawhood, and causation can be accounted for in nonmodal terms; whether laws of nature presuppose the existence of natural kinds; and whether the properties of macrolevel objects supervene on dispositional or nondispositional properties.

This article surveys the scope (section 1), historical origin (section 2), exemplary subject matters (section 4), and methodology (section 5) of Metaphysics of Science, as well as the motivation that drives it (section 3).

Table of Contents

  1. What Is Metaphysics of Science?
    1. Metaphysics and Metaphysics of Science
    2. Philosophy of Science and Metaphysics of Science
    3. Explication
  2. Metaphysics of Science in the 20th (and Early 21st) Century
    1. The Logical Empiricist Critique of Metaphysics
    2. The Return to Metaphysics
    3. Naturalized Metaphysics and Inductive Metaphysics
  3. Why Do We Need Metaphysics of Science?
  4. Sample Topics in Metaphysics of Science
    1. Dispositions
    2. Counterfactuals and Necessities
    3. Laws of Nature
    4. Causation
    5. Natural Kinds
    6. Reduction, Emergence, Supervenience, and Grounding
    7. Space and Time
  5. The Methodology of Metaphysics of Science
    1. Theoretical Virtues
    2. Inference to the Best Explanation
    3. Indispensability and Serviceability Arguments
    4. Extensional Adequacy and the Canberra Plan
  6. References and Further Reading

1. What Is Metaphysics of Science?

Metaphysics of Science is a subdiscipline of philosophy concerned with philosophical questions that arise at the intersection of science, metaphysics, and the philosophy of science. The term “Metaphysics of Science,” which combines the names of these disciplines, is of 20th-century coinage. In order to fully understand what Metaphysics of Science is, it is helpful to clarify how it differs from both metaphysics simpliciter and philosophy of science.

a. Metaphysics and Metaphysics of Science

Metaphysics simpliciter seeks to answer questions about the existence, nature, and interrelations of different kinds of entities—that is, of existents or things in the broadest sense of the term. It enquires into the fundamental structure of the world. For example, it asks what properties are, how they are connected to the entities which have them, and how the similarity of objects can be explained in terms of their properties. The subject matter of metaphysics is somewhat heterogeneous: topics include the composition of complex entities (such as tables, turtles, and angry mobs), the identity and persistence of objects, problematic kinds of entities (that is, entities about which it is unclear whether or in what sense they exist at all, like numbers and fictional objects such as unicorns), and many more. Metaphysics is usually understood as working at an abstract and general level: it is not concerned with concrete individual things or particular relations but rather with kinds of things and kinds of relations.

Metaphysics of Science is not completely disjoint from metaphysics simpliciter. Not only does it draw on the pool of methodological tools employed in metaphysics, but there is also substantial overlap regarding subject matter. Metaphysicians have their own reasons, independently of science, to investigate causation, modality, and dispositional properties, for example. Like space and time, these concepts pertain also to everyday phenomena. Although Metaphysics of Science, too, is usually attentive to our everyday intuitions and opinions about such phenomena, it engages in a specific investigation of the roles these concepts play in scientific contexts.

Metaphysicians of science often take scientific realism for granted—that is, they hold the philosophical stance that the sciences are apt to find out what the world is really like, that they track the truth, and that the entities they postulate exist. Antirealism about science, on the other hand, often coincides with a skeptical or agnostic attitude towards metaphysics. In the context of some broader metaphysical inquiries, scientific endeavors might well be seen as but one way to the truth. A mainly science-guided metaphysics might even be seen as mistaken (as, for example, in phenomenological approaches (compare Husserl 1936; 1970)).

Moreover, metaphysicians of science demand of themselves that they pay attention to discourses within the sciences. For example, some physicists like Richard Feynman (1967) speak of fundamental symmetry principles and conservation laws as being constraints on other, less fundamental laws of nature (they are the laws of laws, so to speak), rather than being laws about what is going on in the world. Metaphysicians working to develop a philosophical theory of nomicity (lawhood), therefore, should allow for the possibility of there being laws of nature as well as laws of laws.

In short, Metaphysics of Science is that part of metaphysics that enquires into the existence, nature, and interrelations of general kinds of phenomena that figure most prominently in science. Also, Metaphysics of Science grants the sciences authority in their categorization of the world and in their empirical findings.

In terms of content, the transition between Metaphysics of Science and science might well be smooth with no clear border, so the distinction might be one that can only be made sociologically, regarding the departmental structure of universities or focusing on the practitioners and their methods of inquiry. Whereas many physicists (although perhaps not all: see theoretical physics) engage in experimental work, metaphysicians are happy merely to consult the findings of their empirically working colleagues from the science departments.

b. Philosophy of Science and Metaphysics of Science

On the other hand, Metaphysics of Science may just as well be called a part of the philosophy of science. Philosophy of science consists of the philosophical reflection on the preconditions, practices, and results of science in general and of the particular sciences (such as physics, biology, mathematics, sociology, and so forth). Many philosophers of science are engaged in debates surrounding science as a (putative) source of knowledge: what makes scientific results especially reliable? That is, what distinguishes science from non- or pseudoscience, everyday knowledge, and philosophy? Which kinds of methods do and should scientists employ? What is scientific progress? Are scientific theories true (despite being fallible)? Are we ever justified in advocating a particular scientific theory, given that most scientific theories of the past have been replaced by others (as, for example, Newtonian mechanics was replaced by relativistic mechanics)? Can the sciences be unified into one big Theory of Everything? Together, these questions constitute the epistemology of science, that part of the philosophy of science which studies scientific knowledge.

Metaphysics of Science complements the epistemology of science. Whereas the latter asks questions of the sort, “How do we know of x?” Metaphysics of Science enquires, “What is the nature of x?” where “x” is a placeholder for some (kind of) entity, state of affairs, or fact discovered or postulated by science.

The task of Metaphysics of Science is not simply to list these entities or facts. Rather, it operates at a higher level of abstraction. For example, whereas the particular sciences inquire into specific causal relations—or, differently put, into some particular relation that holds between two particular measurable quantities, like the concentration of a drug and the soothing effect it has on headaches—Metaphysics of Science attempts to say what causation is in general. That is, it asks exactly which features a relation must have in order to count as a causal relation (like regular occurrence or modal force), and what the respective relata are. In short, Metaphysics of Science enquires into the key concepts of science not at the empirical but at a more abstract and general level.

c. Explication

Philosophers disagree about which key concepts constitute the subject matter of Metaphysics of Science. Some (like Mumford and Tugby 2013, 6) argue for a narrow interpretation of the term and claim that Metaphysics of Science is primarily concerned with concepts which are relevant to all branches of science, because without these central concepts, science would not be possible. For example, they suggest (16) that kindhood, lawhood, and causation are concepts of this kind. Others, for example the Society for the Metaphysics of Science, are more permissive: they also include in the domain of Metaphysics of Science issues that arise in only some branches of science, such as problems regarding species (biology), intentionality and consciousness (psychology), and social kinds (social science). Probably due to the emphasis that 20th century philosophy of science placed on physics, the larger part of debates within Metaphysics of Science revolves around topics that occur most prominently within the realm of physics, but which figure or bear connections to the other sciences as well:

  • laws of nature, causation, and dispositions
  • necessity, possibility, and probability
  • (natural) kinds and essences
  • reduction, emergence, and grounding
  • space and time.

Regardless of whether philosophers defend a narrow or a more permissive notion of Metaphysics of Science, they agree that the concepts in question are in need of explanation. At the very least, such an explanation must show how the concepts cohere. Some metaphysicians take one or more of the concepts they discuss (alongside their related phenomena) to be primitive, meaning that these concepts cannot be analyzed in terms of other concepts and their related phenomena cannot be subsumed under other phenomena. Typically, they then proceed to show that other concepts (alongside their related phenomena) can be explicated in terms of these primitive concepts. (For an exemplary account of some potentially primitive concepts and how they cohere, see parts a through d in section 4.)

As a discipline in its own right, Metaphysics of Science is still relatively young, especially when compared to other areas of philosophy (such as epistemology and ethics). Its topics, however, are not. For as long as science has existed, there has been metaphysical reflection on central scientific concepts. Metaphysics of Science of the 21st century differs from natural philosophy of the past in that the aspiration of natural philosophy was to speculatively describe the world as it is, whereas Metaphysics of Science is more concerned with what the world would be like if our best scientific theories were to turn out true (compare Carrier 2007, 42).

2. Metaphysics of Science in the 20th (and Early 21st) Century

a. The Logical Empiricist Critique of Metaphysics

Of the many historical roots of modern philosophy of science, Logical Empiricism (often interchangeably called “Logical Positivism”) stands out. The Logical Empiricists and their sympathizers (especially Rudolf Carnap, Moritz Schlick, Otto Neurath, Hans Reichenbach, Alfred Ayer, and Carl Gustav Hempel) were the progenitors of a new kind of philosophy (that directly relates to the philosophical work of Gottlob Frege, Bertrand Russell, and Ludwig Wittgenstein, which later came to be known as “analytic philosophy”). They influenced many of the most prominent philosophers of the late 20th century (among them Karl Popper and Willard Van Orman Quine). In a sense, it is with them and their themes (laws of nature, causation, counterfactuals) that modern Metaphysics of Science begins, although they would have rejected much that currently goes by that name. Their ideas sparked many of the debates central to Metaphysics of Science.

In the 1930s, the Logical Empiricists proposed an empiricist, positivist program. They held that experience is our only source of nondefinitional knowledge (hence the “Empiricism” in the name) and that the task of philosophy is logical analysis, that is, analysis of the logical features of and relations between sentences (hence the “Logical”). According to the Logical Empiricists, all the empirical propositions we believe can be reduced to so-called protocol sentences, which are direct renderings of our perceptual experience, or “the given.” Only if we know how a sentence could in principle be verified (that is, which possible observations would result in our accepting it as true) can we say that the sentence is meaningful. This so-called verifiability criterion of meaning has one purpose in particular, namely, to exclude metaphysical speculation from the realm of meaningful discourse. For example, the metaphysical sentence “every thing has an immaterial substance” cannot be empirically verified; hence, according to the verifiability criterion of meaning, it is meaningless. A radical antimetaphysical stance was one of the key tenets of Logical Empiricism. Note that verificationism recasts the Empiricists’ epistemic doctrine that all factual knowledge comes from sense perception as a semantic doctrine. Indeed, if we believe that what we know is expressed (or at least expressible) in meaningful sentences, then the transition from Empiricist epistemology to semantics is straightforward: all factual knowledge is expressed in meaningful sentences and only those sentences for which we are able to give a method of verification in observation are meaningful.

It soon became apparent, however, that Logical Empiricism, and especially the verifiability criterion of meaning, has some serious flaws. Two major blows came from Willard Van Orman Quine’s seminal paper “Two Dogmas of Empiricism” (1951), which argued that two assumptions the principle of verification has to presuppose are untenable. The first is that there is a clear distinction between analytically true and synthetically true sentences. The second is that each meaningful sentence faces the tribunal of sense experience on its own for its verification or falsification (rather than holistically in concert with other sentences).

Logical Empiricism faces further problems. Clearly, the Logical Empiricists held the sciences in high esteem. Usually, it is taken for granted that the sciences aim to discover natural laws and that they research properties such as electro-conductivity of different materials, reactiveness of chemical compounds, and fertility of organisms. Prima facie, it seems that many laws of nature can be expressed as general statements, that is, as statements of the form “any particular thing x which has property F also has property G” (in logical notation: ∀x(Fx → Gx)). For example, we say that all samples of metal expand when heated. But universal generalisations of this kind cannot ever be proven true by actual empirical observations (because they have far more instances, maybe infinitely many, than could ever be observed and confirmed), so the verifiability criterion rules out (at least some) laws of nature as meaningless. Even if this consequence could be avoided, what the laws of nature say is often taken to not be merely accidentally true, but to ensue with modal force. Empirically, we cannot account for modality: we can only observe what is actually the case, not what else is possibly or necessarily true.

Similarly, Logical Empiricism runs into problems regarding dispositional properties. Everyday properties such as solubility and scientific properties like conductivity cannot easily be reduced to the observable qualities of soluble or conductive objects. For example, a sugar cube is a somewhat solid object, much like a matchstick, but if we were to place the sugar cube in water, it would dissolve, whereas the matchstick would not. Its manifest properties such as solidity, color, and taste provide no clue as to what will happen to the sugar cube if placed in water. What is more, even if a particular sugar cube (or even all the sugar cubes in the world) were never placed in water at all (or if it were placed in water but the water was already supersaturated with sugar so that the sugar cube would not dissolve in that particular situation), it would nevertheless retain its dispositional property of being soluble, although there is nothing about it that we observe which hints at its solubility. An analogous case can be made regarding dispositional properties discussed in the sciences, like conductivity or chemical bonding propensity, and similarly, regarding science’s theoretically postulated, not directly observable, entities like quarks or superstrings. Because dispositional properties, theoretical entities, and universally generalized laws of nature appear to belong to the conceptual inventory of the sciences, Logical Empiricism, which fails to adequately account for them, quickly became an unattractive option. (For more on laws of nature and dispositions, see sections 4c and 4a.)

b. The Return to Metaphysics

The failure of Logical Empiricism to cope with some of the key concepts of science eventually led to the development of Metaphysics of Science. Philosophers realized that if concepts such as law of nature and necessity could not be eliminated by reduction to observation terms, it must then be legitimate to examine them thoroughly, by whatever means seem fit. The most likely candidate to fulfill this task is metaphysics. (For an overview of methods commonly applied in Metaphysics of Science, see section 5.)

The development of Metaphysics of Science occurred simultaneously with the revival of metaphysics in the analytic tradition of philosophy, a tradition that was rooted in Logical Empiricism (as well as in the linguistic turn, manifested by the ideal and ordinary language philosophies of the late 19th and mid-20th centuries). Analytic philosophers were initially hostile towards metaphysical questions. They rejected questions which transcended empirical observation or fell outside of the scope of the sciences. However, philosophers like Willard Van Orman Quine (most famously in his essay “On What There Is” (1948)) and Peter Strawson (especially in his monograph Individuals (1959)) soon realized that there is a supposedly innocent way of practicing metaphysics by describing human conceptual schemes rather than by speculatively conjuring up grand metaphysical edifices. Instead of laying claims to knowledge of the unobservable, they focused on finding out how humans in fact conceptualize reality, whether in their everyday language (Strawson) or in their scientific theories (Quine). If stronger authority is given to the sciences, the latter may revise the commitments of the former. Quineans favor this revision and are, hence, closer to the attitude of Metaphysics of Science, whereas Strawsonians also give much credibility to the general metaphysical background assumptions of ordinary people.

Encouraged by the failure of Logical Empiricism and the fact that metaphysical questions were once again beginning to be the subject of philosophical discussion, philosophers developed a renewed interest in metaphysics. They gradually grew confident in talking not merely about observations, semantics, and language, but also about reality.

Another significant step towards the return to metaphysics was the development of modal logic. Begun by Carnap—for example, in his Meaning and Necessity (1947)—the logic of necessity, possibility, and counterfactuality was refined considerably by Ruth Barcan Marcus (1947), Saul Kripke (1963), and David Lewis (1973a). Later, with Kripke’s Naming and Necessity (1980) and Hilary Putnam’s “The Meaning of ‘Meaning’” (1975), the formalisms were given ontological interpretations and the belief in necessity in nature gained new justifications. Building on these developments further still, even (Aristotelian) essences saw their revival: see Kit Fine’s work (1994) and its application within Metaphysics of Science by, for example, Brian Ellis (2001) and Alexander Bird (2007).

The return to metaphysics in the 20th century was not merely a trailblazing event for the development of modern Metaphysics of Science; rather, the two evolved alongside each other. For example, when it became acceptable for metaphysicians to speak of necessities in nature and discuss statements like “Water is necessarily H2O,” this paved the way for a realistic reading of other modalities, like nomological necessity or counterfactuality. These are, as we will see (in sections 4b and 4c), central notions in debates on the nature and status of laws of nature in Metaphysics of Science.

c. Naturalized Metaphysics and Inductive Metaphysics

In the early 21st century, some philosophers argued for a naturalization of metaphysics. Their argument typically rests on the fact that the sciences appear to surpass metaphysics in many respects. The sciences, they claim, have a shared stock of accepted theories, a pool of respected methods and institutionalized standards, and they have predictive and technological successes to show for themselves. In contrast, there is long-lasting disagreement over positions and methods in metaphysics that is rarely ever resolved, and it is unclear what would even count as criteria for metaphysical success. As some metaphysical questions (such as “What is the world ultimately made of?” and “What is life?”) also belong to the domain of the sciences (physics and biology, respectively), naturalists insist that we must draw upon scientific findings to properly answer them.

Naturalistic metaphysicians come in all shapes and sizes. Some naturalists wish to prohibit any metaphysics that is not scientifically evaluable (compare Ladyman and Ross 2007). Some suggest that we should take our cues from scientific practice. For example, Tim Maudlin (2007) argues that lawhood is primitive, as working scientists see no need to analyze the concept. (For more on Maudlin’s position, see section 4c.) Others still allow for the possibility of relevant questions which may not have straightforwardly scientific answers. For example, consider the question “What is it for a thing to persist through time?” Imagine we take a ship out to sea and, little by little, replace every single part of it until none of the original parts remain. Certainly, science can describe how the ship changes, but it will not tell us whether the ship we sail home is still the same as the ship that put out to sea. The latter becomes a pressing, genuinely metaphysical problem, especially when we ask an analogous question about a person’s change and persistence through time.

What is important to remember is that although a naturalized metaphysics may, in a sense, also be called a “Metaphysics of Science,” its proponents may have a very different sort of metaphysics in mind than that presented in section 4.

In the 21st century, some philosophers have stressed that Metaphysics of Science could well be an inductive/abductive enterprise that, just as the sciences do, generalizes empirical data and builds explanatory models on that basis (Paul 2012; Williamson 2016; Schurz 2016; the research group Inductive Metaphysics). (Interestingly, precursors of the idea of an inductive/abductive metaphysics developed simultaneously with Logical Empiricism (Scholz 2018).) If so, metaphysical hypotheses might turn out to be fallible, only approximately true, and contingent.

3. Why Do We Need Metaphysics of Science?

In section 1 it was said that Metaphysics of Science examines the key concepts of science. But why do philosophers even bother to argue over issues in Metaphysics of Science? Is it not relatively clear what the basic concepts in science are and what they mean? Surely scientists know very well what they mean to say when they talk about the solubility of sugar, the second law of thermodynamics, and the relativity of space-time?

What inspires Metaphysics of Science is, of course, the idea that there is more to know about these phenomena and the concepts involved than science can say. Think of causation, for example. The concept of causation is commonsensical: we encounter causal processes in everyday life, like when we hit a golf ball with a putter and the ball begins to move, or when we drop a glass and it shatters. We intuitively distinguish these causal processes from noncausal processes. For example, if somebody in the next room sneezes as you raise your arm, you just know that raising your arm was not the cause of the other person’s sneezing. Still, it is quite complicated to say what establishes a causal connection between two events and what exactly distinguishes the putter-and-golf-ball scenario from the raise-arm-and-sneeze incident. Science records measurements and reveals statistical correlations between phenomena. It also has apt intuitions about whether two events are indeed causally connected or whether they merely co-occur accidentally, albeit regularly. Yet science is rarely interested in a general overall theory (detached from particular, concrete cause-effect relations) of what exactly distinguishes causes from accidents. Concepts such as causation or laws of nature, although relevant for science, are rarely the subject matter of science itself.

Science and Metaphysics of Science have different but complementary approaches to reality: the scientist’s work in this respect is predominantly empirical and consists in finding instantiations—describing particular causal interactions, listing things which are disposed in certain ways, pinning down particular laws of nature, and so on—while the metaphysician’s focus is on understanding and clarifying general concepts or the corresponding phenomena (like causation, disposition, and law of nature).

Still, the critic may object that even if the metaphysician’s and the scientist’s approaches to reality are indeed complementary, we can do perfectly well without Metaphysics of Science. For example, if science manages to find out the different variables and constants that determine how things in the world hang together, why do we also need to know what the general characteristics of a law of nature are or how that notion can be analyzed in terms of other notions? Isn’t this superfluous information? Clearly, scientists do not need metaphysicians to tell them about causation or dispositions in order to perform their research. Nevertheless, metaphysicians of science believe that questions regarding the existence and nature of causation, natural kinds, and necessity are valuable in their own right. At the very least, they are pressing questions that cannot be ignored by those who yearn to thoroughly understand the world we live in. By way of example, consider the dispute between defenders of Humean supervenience and anti-Humeans, which revolves around the question of whether there are necessities in nature or not. (See section 4a for a brief account of the debate.) Clearly, this is not a question that can be answered by purely scientific methods, but it is one that metaphysicians will nevertheless take to be meaningful and profound.

Some of the issues discussed in Metaphysics of Science are also relevant for practical contexts. For example, failure to render assistance (in case of an accident, a medical emergency, or the like) can lead to prosecution or social repercussions due to immoral behavior. However, you can only be held legally and morally responsible for events you are also causally responsible for. Accordingly, both ethics and law require a concept of causality that accounts not just for positive but also for negative causation, that is, causation by the absence of an event or act. If you pass an unconscious person lying on train tracks and fail to alert the authorities or pull him off the tracks, then you are (partly) causally responsible for his death if he is later killed by a train. Thus, although many questions within Metaphysics of Science are primarily aimed at complementing science, its debates may have far-reaching consequences in other fields as well.

To more fully understand the difference between the scientific and the metaphysical approach to the key scientific concepts that constitute the subject matter of Metaphysics of Science, it is helpful to consider samples of actual work in Metaphysics of Science (section 4) and to take a closer look at the methodology employed (section 5).

4. Sample Topics in Metaphysics of Science

As Metaphysics of Science is the study of the key concepts of science, its subject matter depends directly on what the sciences study and which concepts they employ. Because there are many different branches of science, there are also many potential topics for metaphysicians to discuss. It is impossible to name them all in a survey article, much less discuss them in detail. However, it is practically impossible to fully grasp what Metaphysics of Science is from general definitions only. (The same is true of metaphysics in general. No layperson will understand what metaphysicians do from hearing that metaphysics is the study of the fundamental structure of reality.)

In order to give the reader an idea of both the scope of Metaphysics of Science and its practice, this section briefly and tentatively introduces seven debates which have preoccupied metaphysicians of science in the past: dispositions, counterfactuals and necessities, laws of nature, causation, natural kinds, reduction and related concepts, and space and time. (See the respective articles for more information on modal logic and modality, laws of nature, reductionism, emergence, and time.)

a. Dispositions

Some objects have dispositional properties. For example, sugar is soluble, matchsticks are inflammable, and porcelain vases are fragile. Properties like solubility or fragility are often conceived of as becoming manifest only under so-called “triggers” or “stimulus conditions,” which set off the manifestation of the dispositional property. For example, for a sugar cube to manifest its solubility by dissolving, it must be placed in water.

Not all properties are like that. So-called categorical properties need no stimulus; they are always manifest. Just think of the properties of being solid, having a certain molecular structure (for example, being H2O), being rectangular, and so on. The distinction between categorical and dispositional properties is often drawn with the following three features in mind:

(i) Untriggered dispositions are not directly observable, whereas many categorical properties are. For example, from looking at some sort of powder, we cannot tell whether it is soluble or not. Looking at a football, we immediately see that it is round.

(ii) Because dispositional properties bestow objects with possibilities (of behaving in certain ways under certain circumstances), they are said to be modal properties: they imply, by their very nature, what can, might, or (given certain circumstances) must be the case. Categorical properties are not usually conceived of in this way.

(iii) Dispositional properties are often identified with productive powers. For example, scratching a match is not enough for it to light up; the match’s inflammability, too, is causally responsible for the flame. Usually, no such productive, causal force is directly associated with categorical properties.

Dispositional properties are not just a phenomenon we encounter in everyday contexts, but in science as well. For example, the property of being charged appears to fit this profile: it is not directly observable, it determines how objects would behave under certain conditions, and an object’s charge can be a vital factor in causal processes. Dispositionality has hence been of interest to Metaphysics of Science since its very beginning. In fact, the failure of Logical Empiricism to properly account for dispositional properties played a seminal role in the emergence of the discipline (see section 2a).

Because of their shared belief that all of our knowledge ultimately reduces to observational experience, Logical Empiricists like Rudolf Carnap (1936) attempted to account for dispositional properties in terms of observational properties using a simple conditional to connect the trigger to the manifestation: to say that a sugar cube is soluble just means that if we put it in water, it will dissolve. This and similar attempts at reduction fail, however, as they do not account for the modal behavior of disposed objects. For example, they do not supply a basis on which to ascribe (or not to ascribe) solubility to objects which have never been placed in water. This strikes us as odd, as it does not correspond to our everyday practice.
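The difficulty with untested objects can be made concrete in a minimal sketch. It assumes, purely for illustration, that the “simple conditional” is read as the truth-functional material conditional and used as an explicit definition of solubility; the function names and the two sample objects are hypothetical. On that reading every object that is never placed in water comes out soluble, the matchstick included. (Carnap’s own reduction sentences avoid this result, but only by leaving the untested cases undecided.)

```python
def material_conditional(antecedent: bool, consequent: bool) -> bool:
    """Truth-functional 'if A then B': false only when A is true and B is false."""
    return (not antecedent) or consequent


def soluble_by_simple_conditional(placed_in_water: bool, dissolves: bool) -> bool:
    """Naive analysis assumed in this sketch:
    x is soluble iff (x is placed in water -> x dissolves)."""
    return material_conditional(placed_in_water, dissolves)


# Two hypothetical objects that are never placed in water.
sugar_cube = {"placed_in_water": False, "dissolves": False}
matchstick = {"placed_in_water": False, "dissolves": False}

sugar_verdict = soluble_by_simple_conditional(**sugar_cube)
match_verdict = soluble_by_simple_conditional(**matchstick)

# Both verdicts are True: the conditional is vacuously satisfied,
# so the analysis cannot tell the sugar cube from the matchstick.
print(sugar_verdict, match_verdict)
```

The only case in which this analysis denies solubility is the one where the object is actually placed in water and fails to dissolve, which is why it gives no usable verdict for objects that are never tested.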

In order to adequately capture the modal nature of dispositions, philosophers soon suggested that we employ a counterfactual connective instead of the simple conditional. To say that some object has a disposition, they argued, means just that if the object were exposed to the trigger conditions, the disposition would manifest. This approach faces at least two problems. First, it requires a theory that specifies truth conditions for counterfactual conditionals (see section 4b). Second, there are some interesting counterexamples to the effect that under certain conditions we would intuitively ascribe dispositions to objects for which the proposed analysis fails (as in Charles Martin’s 1994 electro-fink example).

Although early attempts at reducing dispositions to categorical properties have failed, problems like the above have convinced some philosophers that we should strive for a reductive analysis after all. The philosophical position that holds that all properties are categorical and that supposedly dispositional properties can somehow be reduced to categorical properties is called “categoricalism.” For many categoricalists, a large part of their motivation comes not from Logical Empiricism but a fundamental insight of classical empiricism. David Hume famously observed that necessary connections, like those between causes and their effects, cannot be detected empirically. Hence, Hume concludes, we have no reason to assume that any sort of productive, necessary, or modal connection of events in nature exists. (This has come to be known as Hume’s Dictum.) Twenty-first century Humeans, too, claim that there are no necessary connections in nature. Consequently, they deny that there are irreducible, metaphysically fundamental dispositional properties that seem to imply some sort of necessary or modal connection between the trigger and the manifestation.

However, as reduction proves to be notoriously complicated, other philosophers opt for dispositionalism instead, which is, in its most radical form (pan-dispositionalism), the view that all properties are of a dispositional nature. Both categoricalism and pan-dispositionalism are monistic theories, as both claim that there is, at the fundamental level, only one type of property. It is also possible for philosophers to hold a neutral or dualistic view, according to which there are both categorical and dispositional properties at the fundamental level of reality.

The debate over dispositions has had substantial impact on other debates within Metaphysics of Science and vice versa. For example, some philosophers argue that laws of nature and causation are grounded in dispositional properties: a law of nature like “Like-charged objects repel each other” could well be true because of the dispositional nature of charge, and causal successions of events could be determined by the dispositional properties of objects involved (for example, wood paneling can be a partial cause of a house fire because it is inflammable). Other philosophers see the direction of dependence exactly the other way around: dispositions depend on laws of nature, because if the laws of nature were different, objects might have different dispositions. For example, if the laws of ionic bonding were different, salt might not dissolve in water. Similarly for causation: maybe salt has its disposition to dissolve because its ionic structure is a potential cause of dissolving. Hence, the debate over dispositions should not be viewed in isolation.

b. Counterfactuals and Necessities

We learned above that a central feature of dispositions is that they establish a modal relationship between the disposed object’s being in the trigger condition(s) and the disposition’s manifestation. A plausible candidate for understanding the nature of this modal relationship is counterfactual dependence. The standard notation for counterfactual dependence reads p □→ q: if p were the case, then it would be the case that q. If a sugar cube is soluble, then that means, at least in part, that if it were placed in water, it would dissolve.

The sentential connective □→ is an intensional connective, which means that the truth value of the entire conditional cannot simply be read off the truth values of the antecedent and the consequent. The reason is easily understood: counterfactual conditionals describe counterfactual situations, which means that both the antecedent and the consequent are usually not currently true. Yet some such counterfactuals with a (currently) false antecedent and a (currently) false consequent are true (the above one capturing solubility, for example) and some such counterfactuals are false (such as “If I were to say ‘abracadabra’ a rabbit would appear”). How then can we evaluate the truth of counterfactual conditionals, given that the truth or falsity of their components is not decisive?

An idea proposed by Nelson Goodman (1947, 1955) and Roderick Chisholm (1946) is to have the truth of a counterfactual conditional depend on both the laws of nature and the background conditions on which they operate. On this account, a counterfactual conditional p □→ q is true if and only if there are true laws of nature L and background conditions C which hold, such that p, L, and C jointly imply q. (Some further conditions must be met, like that the background conditions must be logically compatible with p.) Obviously, if the laws of nature or the background conditions were different, p □→ q might turn out not to be the case.
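Stated schematically, and only as a sketch of the condition just given, the account has the following form. The predicate names Law, Holds, and Compatible are labels introduced here for illustration and are not notation taken from Goodman or Chisholm; the box symbol assumes the amssymb package.

```latex
\[
p \mathrel{\Box\!\!\rightarrow} q \ \text{is true}
\iff
\exists L\, \exists C\, \bigl[\,
  \mathrm{Law}(L) \wedge \mathrm{Holds}(C) \wedge \mathrm{Compatible}(C, p)
  \wedge (p \wedge L \wedge C \vDash q)
\,\bigr]
\]
```

Read this way, the truth of the counterfactual is relative to L and C: change either, and the entailment on the right-hand side may no longer go through.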

An alternative way of thinking about counterfactuals called “possible world semantics” was introduced by David K. Lewis (1973a). Lewis’s most important tool is the concept of a possible world. According to Lewis, our actual world is only one among a multitude of possible worlds. A possible world is best thought of as one way (of many) the actual world could have been: all other things being equal, the word “multitude” in the last sentence could have been misspelled, Lewis could never have been born, or atoms could have been made of chocolate. Robert Stalnaker (1968) proposed a similar account but without defending modal realism (that is, realism regarding possible worlds). To him, possible worlds are tools, and as such no more than descriptions of worlds that do not exist.

Some possible worlds are more similar to ours than others. For example, a world which is like ours in every respect except that “multitude” is misspelled in the preceding paragraph is more similar to the actual world than a world with chocolate atoms. In evaluating a counterfactual’s truth value, this fact plays a central role. Consider, for example, the sentence “If David had not overslept, he would not have been late for work.” In a world where all vehicles miraculously disappeared that morning, where the floor of David’s bedroom was covered in super strong instant glue, or where the laws of nature suddenly changed so that movement is no longer possible, he would not have made it into work in time, even if he had gotten up early. But these worlds do not interest us; this is clearly not what we mean by saying that had David not overslept, he would have made it in time. To judge whether the counterfactual conditional is true regarding our world, we need to consider only worlds where the laws of nature remain the same and everything else is rather normal (that is, similar to what actually did happen) except for the fact that David did not oversleep (and maybe some minor differences).

Lewis and Stalnaker suggest that an ordering of worlds with respect to similarity to our world is possible. Naturally, worlds where many facts are different from the facts of our world, and worlds with different laws of nature, count as particularly dissimilar. Counterfactual truth can then be determined as follows: of all the possible worlds where p is the case (for short, the p-worlds), some will be q-worlds and others non-q-worlds (that is, worlds where q is true or not true, respectively). To determine whether the counterfactual conditional p □→ q is true for our world, we need to check whether the p-worlds that are also q-worlds are more similar to our world than the p-worlds that are non-q-worlds. So to find out whether it is true that David would have gotten to work in time had he not overslept, we look at possible worlds where David did not oversleep and check whether the worlds where he makes it into work are more similar to the actual world than worlds where he does not (because, say, all buses disappear or the floor is sticky).
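The similarity-based evaluation just described can be illustrated with a small Python sketch. It is a toy model under strong assumptions, not Lewis’s or Stalnaker’s own formalism: there are only finitely many worlds, similarity is collapsed into a single made-up number (`distance`), and the world names and facts are invented for the David example. Under these assumptions, p □→ q holds if all the closest p-worlds are q-worlds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    name: str
    distance: int      # assumed dissimilarity to the actual world (0 = actual)
    facts: frozenset   # atomic facts that hold in this world


def would(p, q, worlds):
    """Toy truth condition for 'p []-> q' over a finite set of worlds."""
    p_worlds = [w for w in worlds if p(w)]
    if not p_worlds:
        return True  # no p-worlds at all: vacuously true
    closest = min(w.distance for w in p_worlds)
    # True if every p-world at the minimal distance is also a q-world.
    return all(q(w) for w in p_worlds if w.distance == closest)


# Hypothetical worlds and distances, chosen only for illustration.
WORLDS = [
    World("actual", 0, frozenset({"overslept", "late"})),
    World("normal morning", 1, frozenset()),
    World("bus accident", 2, frozenset({"accident", "late"})),
    World("vehicles vanish", 9, frozenset({"no_vehicles", "late"})),
]


def not_overslept(w):
    return "overslept" not in w.facts


def on_time(w):
    return "late" not in w.facts


def late(w):
    return "late" in w.facts


def not_overslept_but_accident(w):
    return not_overslept(w) and "accident" in w.facts


# "If David had not overslept, he would not have been late."
print(would(not_overslept, on_time, WORLDS))
# "If David had not overslept, yet the bus had had an accident, he would have been late."
print(would(not_overslept_but_accident, late, WORLDS))
```

In this toy model both conditionals come out true at once, because strengthening the antecedent shifts attention to a different set of closest worlds. The far-fetched world in which all vehicles vanish never affects the verdict.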

According to this analysis, the consequent need not be true in all possible worlds (but only in similar p-worlds) in order for a counterfactual to be true. For example, had David overslept in a world where objects can be transported via beaming, he might still have made it to work in time. But as it is doubtful whether this technology will ever be available in our world (as it is not clear whether it is compatible with our laws of nature), the world where beaming has been invented is not relevant for the evaluation of the counterfactual conditional.

Related to what has just been said, we can point out a welcome feature of counterfactual conditionals: it can be true both that if David had not overslept, he would not have been late for work; and that if David had not overslept, yet the bus had had an accident, he would (still) have been late for work. This is a feature that necessary conditionals and mere material implications cannot readily accommodate (or only with the undesirable implication that it is impossible for David not to oversleep together with the bus having been involved in an accident).

In addition to providing a way of understanding counterfactual conditionals, possible world semantics allows us to spell out the modal notions of necessity and possibility in terms of quantification over possible worlds. Thus, a sentence p is necessarily true (in logical notation: □p) if and only if it is true in all possible worlds. If p is necessarily true, there is no way that p could be false; that is, there is no possible world where p is false. Similarly, p is possibly true (in logical notation: ◊p) if and only if it is true in at least one possible world.

Necessity is thus expressed in terms of universal quantification over (all) possible worlds, whereas possibility is existential quantification over (all) possible worlds. Like the universal and existential quantifiers, necessity and possibility, too, are interdefinable: if p is necessary, then it is impossible that non-p, and if p is possible, then it is not necessarily the case that non-p.
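Written out in symbols, the truth conditions and the interdefinability of the two operators look as follows. This is a sketch under two assumptions not found in the text above: w ranges over all possible worlds, and “w ⊨ p” abbreviates “p is true in w.” The symbols \Box and \Diamond assume the amssymb package, and align* assumes amsmath.

```latex
\begin{align*}
\Box p     &\iff \forall w\,(w \models p)  \\
\Diamond p &\iff \exists w\,(w \models p)  \\
\Box p     &\iff \neg\Diamond\neg p        \\
\Diamond p &\iff \neg\Box\neg p
\end{align*}
```

The last two lines mirror the familiar duality of the quantifiers, by which “for all x, Fx” is equivalent to “there is no x such that not Fx.”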

Note that there are different sorts of necessity which can be easily accounted for if we conceive of necessity and possibility in terms of quantification over possible worlds: Logical, metaphysical, and nomological necessity can be defined by restricting the scope of worlds over which we quantify. For nomological necessity, for example, we restrict quantification to all and only worlds where our laws of nature hold.

Possible world semantics faces several problems, however. For example, it is unclear just how we can know about what is or is not the case in other possible worlds. How do we gain access to possible worlds that are not our own? Nevertheless, possible world semantics is a valuable tool for understanding some of the most central issues in Metaphysics of Science, such as dispositions and causation. In addition, necessity is a crucial element in theories of laws of nature, essences, and properties. The modalities of necessity, possibility, and counterfactuality are also important in their own right: after all, knowing what would happen if something else were the case or what can or must happen is key to scientific understanding.

c. Laws of Nature

Here are some intuitions philosophers have about laws of nature: laws are true or idealized, objective, universal statements. Laws of nature support counterfactuals, are confirmable by induction, and are explanatorily valuable as well as essential for predictions and retrodictions. Laws have modal power in that they force certain events to happen or forbid them from occurring. Any analysis of the concept will attempt to account for at least some of these features. Roughly, there are five types of theories of laws of nature: regularity accounts, necessitation accounts, counterfactual accounts, dispositional essentialist accounts, and accounts which take laws to be ontological primitives.

The basic idea of early regularity accounts is that a law of nature is a true, lawlike universal generalization (usually of the form “All F are G,” or in formal notation: ∀x (Fx → Gx)). Whether a given generalization is true is, of course, an empirical matter and must be determined by the sciences, but what it means for a statement to be lawlike is left for metaphysics to define. Not all general statements are lawlike. For example, some general statements state logical truths which clearly are not laws of nature (like “All ravens are ravens”). The main challenge for regularity theories is figuring out what makes a universal statement lawlike without appealing to any sort of connection between events other than regularity.

The Best Systems Account (Lewis 1973a) is an example of a sophisticated regularity theory. It asks us to imagine that all facts about the world are known, such that you know of every space-time point what natural properties are instantiated at it. There are many different ways of systematizing this knowledge by using different sets of generalizations. These generalizations make up competing deductive systems. Defenders of the Best Systems Account hold that a (contingent) generalization is a law of nature if and only if it is a theorem within the best such system. Which system is the best is determined by appeal to certain criteria: simplicity, strength (or informational content), and fit.

The Best Systems Account has been criticized for not taking seriously the intuitions that laws of nature are objective, have explanatory value, and hold with modal force. The Best Systems Account yields regularities, but it does not explain why they obtain. Opponents of regularity theories stress that laws do not merely state what is the case, but enforce or produce what happens.

Necessitation accounts are alternatives to the Best Systems Account that endorse this idea. Such accounts have been proposed by David Armstrong (1983), Fred Dretske (1977), and Michael Tooley (1977). For Armstrong, a law of nature is a necessitation relation N between natural properties. (Armstrong speaks of universals.) For two natural properties to be related by necessitation means that one of them gives rise to and must be accompanied by the other (hence necessitation). To give a coarse-grained example: Coulomb’s law (which states, very roughly, that charges exert forces onto other charges), is a true law statement if and only if necessitation holds between the properties of having a certain charge (C) and exerting a certain force (F): N(C, F).

Necessitation accounts have some advantages over regularity theories. For example, they can more easily allow for uninstantiated laws. But how exactly do we know which properties are related by the necessitation relation, and why should we even assume that it exists? Armstrong argues that necessitation can be experienced insofar as it manifests in causal processes. However, not all laws are causal laws. Defenders of necessitation accounts must work out these issues.

The counterfactuals account focuses on a feature related to necessity, namely, the fact that laws of nature are stable under counterfactual perturbations. For example, that nothing can be accelerated beyond the speed of light is a law of nature because it is a fact that no matter what fantastical interventions we were to devise, we still couldn’t travel faster than the speed of light. Versions of the counterfactuals account of laws of nature have been proposed by James Woodward (1992), John Roberts (2008), and Marc Lange (2009).

A bullet that counterfactual accounts have to bite is that the intuitive order of explanation regarding laws and counterfactuals is upside down: whereas the counterfactual theory of laws says that it is a law that all bodies fall down to earth because it is fundamentally true that “were some arbitrary massive body dropped it would fall,” we intuitively believe that “were we to drop this body it would fall” is true because the law of gravitation holds. In other words, it is more intuitive to hold that the laws of nature support counterfactuals rather than that counterfactuals support the laws.

Another prominent way to account for laws of nature is to appeal to dispositional essentialism. Dispositionalists, like Brian Ellis (2001), Alexander Bird (2007), or Mumford and Anjum (2011), believe that some or even all properties are essentially dispositional. For example, if an object has the property of being electrically charged, that just means that it has the dispositional property of being attracted or repelled by other charged objects nearby. In this sense, the property of being electrically charged is essentially dispositional, because no object is electrically charged unless it is disposed to be attracted or repelled in this way.

Now, if natural properties bestow dispositions on their bearers, then it is always true that if something has a given natural property (Px), it also has a certain disposition (Dx), and thus it will manifest in a certain way (Mx), given that the disposition’s corresponding trigger occurs (Sx). (In formal notation: □∀x((Px ∧ Sx) → Mx).) This is precisely what many metaphysicians ask of laws: that they bring about or make necessary what happens when something else is the case. Dispositional essentialists thus claim that dispositions ground nomological facts: laws arise from the dispositions things have.

Obviously, the dispositional essentialist account of lawhood hinges on non-trivial premises, which must be evaluated in their own right—for example, the premise that dispositions are basic.

If analyzing lawhood is so complicated an affair that it requires elaborate theories and intricate tools, why not assume that lawhood is conceptually and ontologically primitive, that is, that the concept of lawhood cannot be defined in terms of other concepts, and that it cannot be reduced to underlying phenomena? Tim Maudlin (2007) argues that scientists do not seek to analyze laws, but rather accept their existence as a brute fact in their daily practice, and that philosophers should do likewise.

To Maudlin, a law of nature is that which governs a system’s evolution through time and determines what future states can be produced from the current state of the system. As lawhood is a primitive concept for Maudlin, he attempts to utilize it in defining other notions, like causation and counterfactual truth. Whether Maudlin’s approach is viable depends in large part on whether these definitions of causation and counterfactual dependence by means of laws of nature work out.

d. Causation

Causation is obviously intimately connected to the laws of nature, as we would expect at least some laws to govern some causal relationships. Causation, however, is not a straightforward notion. For example, philosophers disagree over which kinds of entities are the proper relata in causal relationships, some potential candidates being substances, properties, facts, or events. There are several approaches to understanding causation: regularity theories, counterfactual theories, transfer theories, and interventionist theories.

Regularity theories follow in the footsteps of David Hume’s treatment of causation. According to regularity theories, all that can be said about causation comes down to stating a regularity in the sequence of events. The motivation for regularity theories stems from the fact that instances of a regularity can be observed, unlike the production of one event by another or a necessary relation between events.

One of the most widely known regularity theories is John Mackie’s INUS account of causation (1965). According to Mackie, an event is a cause if it is an Insufficient but Necessary part of an Unnecessary yet Sufficient condition for the effect to occur. For example, a short circuit (C) alone is not sufficient for a house to burn down (E); there must also be inflammable materials nearby (A) and there must not be sprinklers which extinguish the fire (B). Call this a complex condition (ABC). As the absence of sprinklers and the presence of inflammable materials is not enough to cause a fire, the short circuit is necessary within this complex condition, which is then sufficient for the fire. But there may be other complex events (DFG, HIJ, and so on) which could also bring about the same effect. For example, a lit candle in a dried-up Christmas tree may also cause the house to burn down. As the short-circuit scenario (ABC) is only one of many potential causes of a fire, it is not necessary for the effect to occur, but if it occurs, it is sufficient to bring about the fire.
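The structure of an INUS condition can be made concrete with a short Python sketch. This is a hypothetical formalization rather than Mackie’s own: it assumes the effect occurs exactly when all factors of at least one listed complex condition are present, and the factor names are invented labels for the house-fire example.

```python
def effect_occurs(present, sufficient_conditions):
    """The effect occurs if every factor of some complex condition is present."""
    return any(cond <= present for cond in sufficient_conditions)


def is_inus(factor, sufficient_conditions):
    """Check whether `factor` is an INUS condition in this toy model."""
    for cond in sufficient_conditions:
        if factor not in cond:
            continue
        # Insufficient: the factor on its own does not bring about the effect.
        insufficient = not effect_occurs({factor}, sufficient_conditions)
        # Necessary part: the complex condition fails without the factor.
        necessary_part = not effect_occurs(cond - {factor}, sufficient_conditions)
        # Unnecessary: some other complex condition could also produce the effect.
        unnecessary = any(other != cond for other in sufficient_conditions)
        # Sufficient: the complex condition as a whole brings about the effect.
        sufficient = effect_occurs(cond, sufficient_conditions)
        if insufficient and necessary_part and unnecessary and sufficient:
            return True
    return False


# Assumed complex conditions for "the house burns down".
FIRE_CONDITIONS = [
    frozenset({"short_circuit", "flammable_material", "no_sprinklers"}),
    frozenset({"lit_candle", "dried_up_tree"}),
]

print(is_inus("short_circuit", FIRE_CONDITIONS))
print(is_inus("rain", FIRE_CONDITIONS))
```

On this model the short circuit qualifies as a cause, while a factor that appears in no complex condition does not. If the short-circuit scenario were the only way for the house to burn down, the “unnecessary” clause would fail and the short circuit would no longer count as an INUS condition.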

Like other regularity theories, Mackie’s INUS theory has the disadvantage of classifying as causal some regularly co-occurring coincidences that are, for all we know, not causally related. For illustration, consider a simpler type of regularity theory according to which causation is just regular succession. The problem is that if causation were nothing but regular succession, then we would be forced to say that the rise of consumer goods prices in the late 20th century causes the oceans’ water levels to rise. Obviously, these events coincided but are not causally related.

To avoid this problem, philosophers devised counterfactual theories of causation. The initial idea presented by David K. Lewis (1973b) is to equate causal dependence with counterfactual dependence. The idea seems plausible: had the cause not occurred, there would (all else being equal) not have been the effect. More precisely, for event e to (causally) depend on event c, whether e occurs or not must depend (counterfactually) on whether c occurs or not (that is, on whether both c □→ e and ¬c □→ ¬e are true, where ¬ is the negation operator). For example, if the short circuit is the cause of the fire, then the house would have burned down if the short circuit had occurred, and it would not have burned down if the short circuit had not occurred.

Lewis saw that this initial account is flawed as it yields intuitively incorrect results in so-called pre-emption scenarios. Imagine two people, Suzy and Billy, throwing rocks at a bottle. Now picture a situation where if Suzy does not throw her rock, Billy will. Suppose Suzy throws her rock, hits, and the bottle shatters. The effect, namely the shattering of the bottle, is evidently caused by Suzy’s throwing the rock. However, the effect would have occurred even if Suzy had not thrown, because in that case Billy would have thrown his rock and shattered the bottle. In this scenario, we recognize Suzy’s throw as the cause of the shattering, but the latter does not counterfactually depend on the former (because it is incorrect that had Suzy not thrown, the bottle would not have been shattered).
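The failure of simple counterfactual dependence in the pre-emption case can be shown with a deliberately crude Python sketch. It assumes a deterministic toy model in which the two counterfactuals c □→ e and ¬c □→ ¬e are evaluated simply by running the scenario with and without Suzy’s throw; the function names are hypothetical.

```python
def shatters_with_backup(suzy_throws: bool) -> bool:
    # Pre-emption scenario: Billy throws exactly when Suzy does not.
    billy_throws = not suzy_throws
    return suzy_throws or billy_throws


def shatters_without_backup(suzy_throws: bool) -> bool:
    # Comparison scenario: Billy is absent, so only Suzy's throw matters.
    return suzy_throws


def depends_counterfactually(scenario) -> bool:
    """Toy test: the effect occurs with the cause and fails to occur without it."""
    return scenario(True) and not scenario(False)


print(depends_counterfactually(shatters_without_backup))
print(depends_counterfactually(shatters_with_backup))
```

Without the backup thrower the shattering depends on Suzy’s throw. With Billy standing by it does not, even though Suzy’s throw is intuitively the cause in both scenarios.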

Although more sophisticated counterfactual theories are more successful in dealing with pre-emption and other problems, some philosophers choose to take a different approach. Proponents of transfer or conserved quantity theories like Wesley Salmon (1984, 1994), Phil Dowe (1992), and Max Kistler (2006) claim that causation is best understood as a transfer of a physical quantity from one event to another. For example, Suzy is causally responsible for shattering the bottle (and Billy is not) if it was her energy that set the rock in motion to physically interact with the bottle on impact and shatter it. Transferable quantities include energy, momentum, and charge, for example. These quantities are subject to conservation laws, which means that in any isolated system, the sum total of the remaining and the transferred amount of the quantity will always equal the initial amount.

Transfer theories face difficulties in accounting for negative causation. For instance, omitting to water plants may cause them to wither, but there is no transfer of a conserved quantity from anything to the withering. Other problems derive from examples where the supposed causal relationship is not obviously of a physical nature. For example, we may say that wild speculations at the stock market caused the economy to break down or that Suzy’s throwing Billy a kiss causes him to blush.

Of the fourth group of theories of causation, interventionist theories, James Woodward’s approach (2003) is a prime example. Woodward suggests that causation is best characterized by appeal to intervention. Consider the following example: Testing a drug for efficacy consists in finding out whether a group of people who are administered the drug are cured while a group that does not receive the drug remains uncured. In other words, drug testers intervene by giving the drug to some patients and a placebo to others. If the drug intervention leads to recovery while the placebo intervention does not, the drug is said to be causally relevant for the recovery.

Woodward places further constraints on interventions, one of which is that the intervention (of administering the drug or the placebo, respectively) must be performed in such a way that other potential influences are absent. For example, if the drug were given to healthy and young patients while only the elderly and frail receive the placebo, the test might falsely attribute causal efficacy to the drug.

Even when these precautions are taken, Woodward’s theory is at risk of being circular: the analysis presupposes that we understand beforehand what it means to intervene on a system. Intervention, however, is itself a causal notion. Woodward has clarified that his theory is meant to explicate and enlighten our concept of causation, not to reduce causation to other phenomena.

It seems that all theories of causation face difficulties (either in the form of recalcitrant exemplary cases or in that they do not capture certain features of causation). One possible conclusion to draw from this is that causation is not one unified phenomenon but at least two and potentially many more. For example, Ned Hall (2004) argues that our intuitions characterize causation both as production and as counterfactual dependence, and that the problems of analyses of causation can be traced back to the attempt to squeeze these into one unified concept.

The debates over the nature of dispositions, modality, laws of nature, and causation are still ongoing. Many promising approaches have been proposed in their course and will continue to be explored in the future. (For a detailed account of the relation between the debates surrounding dispositions, counterfactuals, laws of nature, and causation in Metaphysics of Science, see Schrenk (2017).)

e. Natural Kinds

In everyday contexts we habitually classify objects or group them together. Some of these groupings seem more natural to us than others. Philosophers who believe that nature comes with her very own classifications speak of “natural kinds.” For example, samples of gold closely resemble each other, differ clearly from other chemical elements, and share a common microstructure, whereas sea life comprises organisms of very different sorts (including crustaceans, fish, and mammals). Terms like “sea life” and “tile-cleaning fluid” are convenient for human purposes such as thinking and talking about groups of things, but we do not expect them to reflect the structure of the natural world (which does not mean that the classifications they introduce are entirely arbitrary). Natural kinds, on the other hand, supposedly “carve nature at the joints” (Plato’s Phaedrus 265d–266a). They are also highly projectible: we can inductively infer from the behavior of one object to that of all objects of the same natural kind.

If natural kinds exist and contribute to the structuring of the world, then ideally we want the sciences to discover what natural kinds there are. A natural kind enthusiast may claim that physics tells us that electrons and quarks exist, chemistry says that there are chemical elements like gold (Au) and compounds like water (H2O), and biology seems to suggest that organisms are ordered hierarchically along the lines of family, genus, and species. However, there are also conventionalists who believe that so-called natural kinds are not independent of the minds, theories, and ambitions of human beings, or that no way of dividing up the world is inherently better than any other. To illustrate their claims, they remind us that the concept of biological species used to be regarded as a prime example of a natural kind, but that, in the meantime, various paradigms (based on the morphology, interbreeding capacities, or shared ancestry of organisms) have been proposed, each leading to a different system of classifications.

If natural kinds exist in nature, then what are they? What makes a natural kind the kind it is? Different ideas have been proposed and have given rise to a multitude of questions: Do objects which belong to natural kinds share at least some properties? Are these special, “natural” properties? Are natural kinds determined by the roles they play in inductive inferences or laws of nature? Is there a hierarchy of natural kinds, such that some kinds are more fundamental than others?

A position that has been particularly influential in the 20th century is the view that natural kinds have essences. It supposedly follows from Hilary Putnam’s Twin Earth thought experiment (1975). Suppose there is a planet just like Earth in every way, but there is a liquid that the inhabitants of Twin Earth call “water” and which resembles water in every respect except for its microstructure, which is not H2O, but XYZ. Intuitively, Putnam claims, XYZ is not water, which leads him to assume that, unlike the superficial properties of being wet, potable, and so on, being H2O is a necessary condition for being water. Similar conclusions can be drawn from Saul Kripke’s argument that if we were to find out that the color we have up to now associated with elementary gold is actually an illusion, we would all agree that gold remains gold so long as it has atomic number 79, no matter what color it is (1980). (Kripke and Putnam’s primary aim is to show that the meaning of the terms “water” and “gold” comes not from our concepts but is determined by the structure of the world. We must, hence, acquire it a posteriori.)

Linked to but distinct from the question of what natural kinds are is the question of whether natural kinds form an ontological category in their own right, or if they can be reduced to other existents like properties. Realists regarding natural kinds believe that talk of natural kinds and successful inferences presupposes the existence of natural kinds in nature. Reductionists, on the other hand, may argue that membership in natural kinds is not only determined by a number of shared properties, but also that it consists in nothing over and above having these properties.

Unsurprisingly, metaphysicians of science are especially interested in finding out which, if any, natural kinds are postulated or discovered by the various branches of science and whether they really qualify as natural kinds by the standards of contemporary metaphysical theories, or whether the theories of natural kinds need to be revised.

f. Reduction, Emergence, Supervenience, and Grounding

The world consists of many different things. Philosophers have always dreamed of rendering it more orderly by systematizing it in just the right way. An important step towards doing so seems to entail an analysis of the relationships and dependencies between things which belong to different strata or levels of reality. The world apparently comes structured in levels, with things on higher levels somehow depending on the things on lower levels. For example, a factory consists of machines, conveyor belts, and so forth; machines are made of various interacting cogs, levers, and wires (which, if left to themselves, cannot fulfill the functions they fulfill within the machine); the cogs are made out of molecules, the molecules are made of atoms, and the atoms are made of protons, neutrons, electrons, and so on. Dependencies like these are studied by the various special sciences. (Note that the idea that science suggests that the world comes structured in levels has been contested by some philosophers (Ladyman et al. 2007, 178).) It is clear, however, that a factory is not composed of machines, conveyor belts, and so on in the same way that an atom consists of particles. Surveying the whole of science, Metaphysics of Science strives to account for the various ways higher level objects depend on lower level entities. The aim is not just to establish what depends on what, but to also clarify and explicate the nature of the dependencies. The kinds of relations most fervently discussed in Metaphysics of Science include reduction, emergence, supervenience, and grounding.

Reduction is often conceived of as a two-place, asymmetrical relationship to the effect that one thing is somehow made of, accounted for, or explained in terms of another thing. Typically, the reduced thing is conceived of as somehow less fundamental or less real, or even considered to be eliminated. Two types of reduction are relevant to Metaphysics of Science. First, there is reduction of one theory to another. For example, is it possible to express some theories of chemistry in terms of physical theories? If so, can all chemical theories be thus reduced? What about biological, psychological, and sociological theories? Second, reduction is sought between different sorts of entities or ontological categories such as phenomena, events, processes, and so on. Potential candidates include reduction of macro-level objects to molecules, atoms, and subatomic particles, reduction of properties to sets of objects which resemble one another, reduction of states of affairs to entities and properties (including relations), and reduction of the mental to the physical. The latter especially has been widely discussed in metaphysics. (Note that the first and second kind of reduction cohere: if reduction of one theory to another succeeds, then ontological reduction of the entities postulated by the former to the entities mentioned in the latter may thereby also be achieved.) For Metaphysics of Science, claims of reduction pertaining to entities postulated by the sciences are of great interest, as are claims regarding reductive relationships between theories and their key concepts.

In a way, then, an armchair is reducible to its constituent parts: the fabric, upholstery, wood, and metal springs. However, an armchair is obviously not the same as a random pile of these materials. Unsurprisingly, philosophers disagree over whether, for particular cases, complete reduction can be achieved or not. For example, how could Bach’s Brandenburg Concerto No. 6 be reduced to its physical properties? Sure, a particular performance depends on the physical movements of the musicians and on how the created soundwaves causally impact on the hearers’ eardrums, but the Concerto is not identical to these physical properties apparent in any given performance of it, as it exists independently of them.

Those who argue that such reductions do not succeed often speak of the irreducible as emergent from the underlying basis. They want to point out that although there is a dependence of the higher on the lower level, the higher level adds something novel and thus cannot be completely reduced to the lower level. An emergent property or phenomenon cannot be accounted for by reduction, because it is believed not to be a property of any of the component parts, and it is not obviously caused solely by their interplay. For example, whether or not you find abortion morally reprehensible does not seem to depend on the physical facts. Given the same situation, somebody else might pass the opposite moral judgment. Whereas such moral considerations are of no great professional import to the metaphysician of science, emergent properties in the sciences are. For example, biology still struggles to explain why higher forms of life have certain properties like consciousness, aspirations, and phenomenal experiences, which are not obviously properties of the underlying matter.

Reduction and emergence are interlevel relations. The most innocent, weakest dependence relation that is compatible with both reduction and emergence is called supervenience. Some thing A (the so-called supervenience set) is said to supervene on some other thing B (the so-called supervenience base) if and only if there can be no difference in A without there also being a difference in B—or, for short, if there is no A-difference without a B-difference. For example, an oil painting’s macro-properties (A)—what it depicts and how it looks to us—supervene on its microphysical properties (B): unless the location, intensity, or color of the paint blotches are changed, the painting will always look the same to us. To better understand the world, metaphysicians of science research supposed supervenience relations in the sciences.
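The slogan “no A-difference without a B-difference” can be turned into a simple check, sketched here in Python. The sketch assumes a finite collection of items and only tests whether supervenience holds among those items; it says nothing about merely possible cases, which the philosophical notion also covers. The painting data and field names are invented for illustration.

```python
from itertools import combinations


def supervenes(items, a_of, b_of):
    """True if no two items differ in their A-properties while agreeing in B."""
    return not any(
        b_of(x) == b_of(y) and a_of(x) != a_of(y)
        for x, y in combinations(items, 2)
    )


# Hypothetical paintings: "blotches" stands in for the microphysical base (B),
# "looks_like" for the macro-properties (A).
paintings = [
    {"blotches": ("red", "gold", "blue"), "looks_like": "sunset"},
    {"blotches": ("red", "gold", "blue"), "looks_like": "sunset"},
    {"blotches": ("grey", "white", "blue"), "looks_like": "harbour"},
]


def looks(painting):
    return painting["looks_like"]


def blotches(painting):
    return painting["blotches"]


print(supervenes(paintings, looks, blotches))
```

Adding a painting with the same blotches but a different look would make the check fail. Two paintings that look alike despite different blotches would not, since supervenience allows B-differences without A-differences.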

In the early 21st century, metaphysicians turned their attention to another sort of interlevel relation: grounding relations. Grounding relations are metaphysical relations which establish a special sort of (noncausal) priority of one over the other. Of two propositions or facts which are related by a grounding relation, one is taken to ground, or account for, the other. Grounding is stronger than supervenience, as it amounts not just to the claim that some A-facts only vary when B-facts vary—which may occur coincidentally—but that A-facts vary because B-facts vary. Unlike some forms of reduction, grounding does not seek to eliminate the grounded fact; attributing full existence to both of them, it merely ascribes a more fundamental status to the grounding fact.

Debates over grounding revolve around a number of pivotal questions, such as whether instances of grounding are all of the same kind or whether they embody a number of different relations (which fall under the larger category of grounding relations), whether the grounding relation is primitive or can be analyzed in terms of other relations, and whether it is an irreflexive, asymmetric, and transitive relation or if other properties should be ascribed to it. The answers to these questions may also have an effect on how we should conceive of interlevel relations in the sciences, and the latter are of great interest to metaphysicians of science.
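To make the three formal properties mentioned above concrete, here is a small sketch that checks them for a finite relation represented as a set of pairs, where a pair (x, y) is read as "x grounds y". The example facts are hypothetical placeholders, and nothing here settles whether grounding actually has these properties.

```python
def is_irreflexive(rel):
    """No element is related to itself."""
    return all(a != b for a, b in rel)

def is_asymmetric(rel):
    """If a is related to b, then b is not related to a."""
    return all((b, a) not in rel for a, b in rel)

def is_transitive(rel):
    """If a is related to b and b to c, then a is related to c."""
    return all((a, d) in rel for a, b in rel for c, d in rel if b == c)

# Hypothetical grounding claims between three levels of facts.
grounds = {
    ("physical", "chemical"),
    ("chemical", "biological"),
    ("physical", "biological"),
}

# The same chain without the "shortcut" pair fails transitivity.
chain_only = {("physical", "chemical"), ("chemical", "biological")}

grounds_is_strict_order = (
    is_irreflexive(grounds) and is_asymmetric(grounds) and is_transitive(grounds)
)
chain_is_transitive = is_transitive(chain_only)
```

A philosopher who denies that grounding is transitive would accept a relation like `chain_only` as a legitimate grounding structure, whereas a defender of transitivity would insist that the third pair must be added.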

g. Space and Time

To most philosophers interested in the field, Metaphysics of Science is not confined to discussing concepts that pervade the whole of science (as, arguably, law of nature and causation do). It is also concerned with metaphysical questions that arise with respect to the particular sciences, like “What is life?” (biology) or “What is the ontological status of cultures, governments, and money?” (sociology). The philosophy of physics, too, gives rise to many interesting metaphysical questions. Among them are questions regarding the nature of space and time, which have been debated since the dawn of Western philosophy and, in the light of modern-day physics, are still at issue in philosophical debates.

As humans, we perceive space and time as different phenomena with differing properties. Space, as we perceive it, extends in three dimensions, and we can (almost) freely move in any direction. Through the physical forces which act upon our bodies, we are capable of detecting some sorts of motion through space (like when we run or jump) but not others (like Earth’s rotation). Time, on the other hand, has a sort of directedness to it (commonly referred to as “the flow of time”). We cannot linger at a particular moment in time, and we cannot go back to previous times. Entities somehow change yet persist through time.

Metaphysicians of science are interested in these phenomena especially in the light of Albert Einstein’s theories of Special and General Relativity. These theories were proposed in order to make sense of the fact that the speed of light was measured to remain constant regardless of the motion of the light source, whereas the velocities of objects depend on the motion of the object relative to an observer. For example, the speed of a train measured by a stationary observer on the platform is greater than its relative speed with respect to another, slower train that moves in the same direction. The speed of light emitted by a lamp on the train, however, will be the same regardless of whether it is measured by a passenger or a bystander. In popular interpretations, Einstein’s theory of Special Relativity suggests that the problem can be solved by postulating that the three spatial and the one temporal dimension form a continuum by the name of space-time. An astonishing consequence could be this: Different observers are in motion with respect to different objects. Their perception of the present is determined by which information is accessible to them, which in turn is a matter of which light signals reach them at a given moment. Therefore, their individual present, past, and future differ according to their state of motion with respect to other objects. Thus, an objective, observer-independent order of points in time does not exist. This view is often referred to as the block universe view, because everything seems to simply exist conjointly, with no objective past or future. Some philosophers also suggest that, on this view, familiar material things are three-dimensional slices of four-dimensional objects (sometimes called “space-time worms”).
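The contrast between the train example and the lamp example can be made concrete with a short calculation. The sketch below compares the everyday (Galilean) rule for relative speeds with the standard special-relativistic composition rule for motion along a single line; the formula is textbook physics rather than something stated in this article, and the train speeds are hypothetical values chosen for illustration.

```python
C = 299_792_458.0  # speed of light in m/s

def galilean_relative_speed(u, v):
    """Speed of an object moving at u, as judged from a frame moving at v."""
    return u - v

def relativistic_relative_speed(u, v):
    """Same quantity under special relativity (collinear motion only)."""
    return (u - v) / (1.0 - u * v / C**2)

fast_train = 50.0  # m/s relative to the platform (hypothetical)
slow_train = 30.0  # m/s relative to the platform (hypothetical)

# Trains: at everyday speeds both rules agree to many decimal places.
train_galilean = galilean_relative_speed(fast_train, slow_train)
train_relativistic = relativistic_relative_speed(fast_train, slow_train)

# Light: the Galilean rule predicts a slower beam for the moving observer,
# whereas the relativistic rule returns the speed of light again.
light_galilean = galilean_relative_speed(C, slow_train)
light_relativistic = relativistic_relative_speed(C, slow_train)
```

For the two trains, both rules yield a relative speed of about 20 m/s. For light, only the relativistic rule reproduces the measured result that passenger and bystander obtain the same value.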

Some philosophers claim that the block universe view is incompatible with presentism (the philosophical position that holds that only what is present exists) and supports eternalism (the view that all events past, present, and future exist). Unfortunately, the latter seems not to correspond to our subjective experiences of time. This poses a genuine dilemma for metaphysicians: should we accept Einstein’s theories and dismiss our subjective experiences, or do we need to reinterpret the remarkably well-corroborated theories to accommodate our everyday conceptions of space and time?

More such fascinating questions remain. How is the (perceived) directedness of time and its irreversibility (which manifests as increase of entropy) best explained? Are space and time finite or infinite? Do they exist fundamentally and independently of the objects in them, or does their existence hinge on the existence of those objects? Quite obviously, these are questions on which scientific theories have a bearing, and Metaphysics of Science works towards solutions that are both philosophically rewarding and scientifically tenable.

5. The Methodology of Metaphysics of Science

Although Metaphysics of Science is concerned with the key concepts that figure prominently in science, its methods are not predominately those of the sciences. Apart from referencing scientific results and practices, Metaphysics of Science has a number of argumentative tools at its disposal that do not usually play an explicit role in scientific methodology but are not entirely unscientific either. In science these forms of arguments are implicitly employed to establish hypotheses when the empirical evidence is insufficient (for example, because two theories are equally well supported by the available evidence). Unlike many scientific theories, metaphysical claims often cannot be tested experimentally at all—not because we lack the technological means to do so, but because the very nature of these claims defies empirical confirmation or falsification. Think, for example, of the claim that laws of nature hold across all possible worlds. This is why reference to theoretical virtues, Inferences to the Best Explanation, arguments from indispensability and serviceability, extensional adequacy, and the Canberra Plan method are of great argumentative importance in Metaphysics of Science.

Note that some philosophers—for example, proponents of naturalized metaphysics (as mentioned in section 2b)—may reject all or some of these methodological tools as transcendental or indefensibly a priori. However, the issue is not currently settled among philosophers, and the tools described below remain widely used in contemporary Metaphysics of Science.

a. Theoretical Virtues

In both science and metaphysics, we strive for internally consistent, comprehensive, unambiguous theories which cohere with our accepted beliefs, have an adequately large scope, and so on. Among the various desiderata, explanatory power and simplicity are often accorded a central role. To strive for an explanatorily powerful theory is to demand that a theory must explain a certain number of phenomena which stand in need of explanation, that it does so thoroughly and systematically, and that it is not ad hoc. The value of explanatory power is obvious: explanation (or at the very least, systematization) is the very purpose of any hypothesis. Not so with simplicity. There are many ways a theory can be simpler than its competitors; for example, it may contain fewer variables than another. Usually, the call for simplicity is understood in terms of parsimony. Occam’s Razor, a principle frequently appealed to in this context, says that entities must not be multiplied beyond necessity—that is, if faced with otherwise equally good theories (in terms of their explanatory power, for example), we are to prefer the one that postulates fewer (kinds of) entities. However, it is unclear whether simplicity and the other explanatory virtues are truth conducive or whether they are primarily pragmatic or aesthetic theoretical virtues (which means, for example, that simplicity is preferable because it is easier to work with simple theories or because they are somehow more agreeable).

Although theory choice criteria are certainly at work in everyday reasoning, philosophy, and science—remember that nobody wants a complicated, inconsistent, unclear, shallow, or incomprehensive theory—the application of such criteria is not straightforward: they must be measured and traded off against each other. Unfortunately, there are no shared standards or guidelines on how this should be done. How do we find out which of two theories is simpler or more consistent with the body of already accepted beliefs? How do we know which criterion trumps another? What is more, whereas in science theory choice criteria are interim solutions until a theory can be empirically proven, there is usually no such post hoc test in Metaphysics of Science. For all these reasons, justifying our appeal to theoretical virtues is not a trivial or easy task.

b. Inference to the Best Explanation

Once it has been determined through careful assessment of the theoretical virtues which available theory is the best explanation for a given phenomenon, we tend to infer that it must also be the correct explanation. In most cases, we will then also say that the entities (objects, fields, structures) postulated in the explanatory theory really exist. That is, we apply a so-called Inference to the Best Explanation (often referred to as “IBE”). For example, astronomers found that the best explanation for a divergence in the orbit of Uranus is the existence of another planet, Neptune, whose gravity interferes with Uranus’ trajectory. Thus, they inferred that Neptune must exist. This hypothesis was confirmed when Neptune was later discovered through telescopes. Similarly, many metaphysicians of science believe that IBE can be applied to metaphysical theories. For example, Nancy Cartwright believes that the best explanation for the fact that laboratory results produced in controlled, sterile settings can be applied to the messy circumstances of the outside world is the existence of underlying dispositions that are examined in the laboratory but also pervade the rest of the world, and she therefore accepts this view as true (Cartwright 1992, 47–8).

Quite obviously, IBEs are not deductively valid, and even the best explanations we have at our disposal can later turn out to be incorrect. For example, when astronomers sought to explain anomalies in the orbit of Mercury, they failed to find Vulcan, a planet postulated explicitly for this purpose, and the anomalies were later explained with the help of the General Theory of Relativity.

Note also that Occam’s Razor and IBEs sometimes pull in opposite directions: whereas IBEs often enrich, rather than reduce, our ontology, Occam’s Razor is set on eliminating as many entities as possible from our ontology. On the other hand, one of the marks of a good explanation is that it does not postulate more than is necessary; that is, it is parsimonious in the sense of Occam’s Razor. Either way, even if metaphysicians can agree on using theoretical virtues and IBEs as argumentative tools, there is still room for debate.

c. Indispensability and Serviceability Arguments

In addition to IBEs, metaphysicians appeal to further inferential arguments to the effect that we should accept certain hypotheses as true. More specifically, indispensability and serviceability arguments basically consist in claiming that if X plays a crucial role with respect to Y, and if Y is either uncontroversial or relates to some postulate that we are unwilling to let go, then the existence of X can (or must) be asserted—that is, we should believe that X exists for the sake of Y.

One reason for accepting the existence of an entity X may be that its existence is indispensable for the existence of Y; that is, Y cannot be the case unless X exists. For example, some metaphysicians argue that the existence of mathematical entities is indispensable for science, and as science is important and probably at least approximately true, we have every reason to believe in the existence of numbers (as Platonic objects, say). Very roughly put, indispensability arguments move from the premise that X is indispensable for Y and the premise that Y is the case to the conclusion that X exists.

(An older variant of the argument from indispensability is the so-called transcendental argument, which usually runs like this: if X is a necessary condition for the possibility of Y, and if we believe that Y is the case, we should also hold that X exists.)

Serviceability arguments are weaker than indispensability arguments. They advise us to accept the existence of a (kind of) entity X if X is serviceable towards end Y. For example, David K. Lewis argues that the assumption that possible worlds are concrete objects (just as our actual world) is highly serviceable (1986, 3): among other things, it provides us with the means to spell out the semantics of counterfactual conditionals. However, there may be other ways of accounting for the truth conditions of counterfactuals (for example, by referring to complete descriptions of fictitious possible worlds instead). Whereas indispensability offers a strong argument for the existence of some sort of entity, serviceability allows for contenders. Different kinds of entities may serve equally well to implement a goal, and serviceability arguments alone may not suffice to determine which of these entities we should believe in.

The evaluation of indispensability and serviceability arguments depends on what you already believe and what goals you pursue (as represented by variable Y). At best, they yield conditional existence claims: if you believe that science is successful and that science would not be successful if it were not for the existence of mathematical entities, then you had better believe in the existence of mathematical entities. If you do not believe that science is successful, then the argument is moot. Awareness of the occurrences of these kinds of arguments within debates in Metaphysics of Science will certainly help you understand your opponent, but it will seldom suffice to settle the issue.

d. Extensional Adequacy and the Canberra Plan

One particularly useful tool in evaluating metaphysical hypotheses is the test for extensional adequacy. To test a theory for extensional adequacy means to examine cases that, according to pretheoretical, intuitive judgment, fall under a concept the theory aims to explicate and to check whether the theory indeed subsumes these cases as instances of the concept. In addition, the theory may be tested with regard to scenarios in which its concepts should intuitively not apply; if the theory (wrongly) applies, it may have to be corrected. For example, suppose someone proposes a metaphysical theory as to what a law of nature is in claiming that a law of nature is nothing but a general statement of the form “All things which have property F also have property G.” This theory will quickly be challenged: “All pigs can fly” is a general statement, but, intuitively, it is not a law of nature, because it is clearly false. Whereas the sentence matches the alleged criterion for lawhood, it is intuitively not a law and thus a counterexample to the proposed analysis of lawhood.
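The test just described can be sketched as a simple comparison between the verdicts of a proposed analysis and pretheoretical judgments. Everything in the sketch is a hypothetical illustration: the `Statement` type, the stipulated intuitive verdicts, and the example sentences other than the flying pigs are assumptions introduced here, not part of the article.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    text: str
    is_universal: bool       # has the form "All Fs are Gs"
    intuitively_a_law: bool  # pretheoretical verdict, stipulated here

def naive_analysis(statement):
    """The proposed analysis: a law is nothing but a general statement."""
    return statement.is_universal

def counterexamples(analysis, cases):
    """Cases where the analysis disagrees with the intuitive verdict."""
    return [s for s in cases if analysis(s) != s.intuitively_a_law]

cases = [
    Statement("All metals expand when heated", True, True),
    Statement("All pigs can fly", True, False),
    Statement("This cup is blue", False, False),
]

found = counterexamples(naive_analysis, cases)
found_texts = [s.text for s in found]
```

Here the naive analysis is extensionally inadequate because it counts "All pigs can fly" as a law. Note that the result is only as reliable as the stipulated intuitive verdicts.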

Tests for extensional adequacy presuppose judgments regarding the extension of the concept in question; that is, they presuppose strong intuitions about which entities or phenomena fall under it or are denoted by it. Preconceptions and intuitions as to what a concept denotes can diverge, however. They may be products of the culture we live in or the way we speak, and professional philosophers’ intuitions may well differ from the preconceptions of the folk.

Understanding a concept is not merely a matter of knowing what it denotes. Usually, concepts also carry meanings, or intensions. The so-called Canberra Plan is a complex two-step method for clarifying both the correct extension and intension of concepts. In other words, the Canberra Plan first seeks to fix the meaning of concepts (intension) by describing the role that instances of a given concept have to fulfill and then, second, strives to identify its actual fulfillers (extension). It was proposed by philosophers associated with the Research School of Social Sciences in Canberra (most notably Frank Jackson and David K. Lewis). First, a concept’s use in everyday, scientific, and philosophical contexts is analyzed by collecting all sorts of platitudes about it. A platitude can be anything we say or believe about the concept. For example, regarding causation, we might believe that causes always precede their effects, that nothing causes itself, and so on. By systematizing the platitudes, the Canberra Planners determine which roles the referents of the concept are usually expected to fulfill. In the second step, they then search for referents, that is, entities or phenomena in the world that match these roles. For our example of causation, the transfer of energy could be proposed as such a role player. Because scientific theories are elaborate attempts at describing the world and because Canberra Planners are generally inclined to believe that scientific theories are at least approximately true (that is, they are scientific realists), particular attention is given to the postulates of the sciences. Depending on whether the second step is successful, we may find out the real extension of the concept in question, or we may have to concede that it has no basis in reality and should be discarded. However, note that there are multiple ways of systematizing platitudes and evaluating scientific theories, and hence the outcome may vary.

Apparently, whichever method(s) we employ, there will always be ways to question our claims in Metaphysics of Science (and in philosophy generally). Apart from the proponents of a radical naturalization of metaphysics, philosophers tend to see this not as a fatal flaw but simply as a characteristic feature which is grounded in the very nature of the discipline. The fact that Metaphysics of Science knows no ultimately decisive method but draws on many different tools that may result in different outcomes is not necessarily a bad thing: these tools may just be the best we have to answer questions that we cannot avoid asking, and there may nonetheless be progress in the form of ever more precise, extensionally adequate theories. At the very least, they allow us to map the field of possible views within Metaphysics of Science.

6. References and Further Reading

    • Armstrong, D. M. 1983. What Is a Law of Nature? Cambridge: Cambridge University Press.
      • Argues that laws of nature are necessitation relations between universals.
    • Barcan Marcus, R. 1946. “A Functional Calculus of First Order Based on Strict Implication.” Journal of Symbolic Logic 11: 1–16.
    • Barcan Marcus, R. 1967. “Essentialism in Modal Logic.” Noûs 1: 91–96.
      • Both seminal texts by Barcan Marcus lay the groundwork for formal modal logic and afford later developments like Kripke’s and Putnam’s ideas on direct designation, rigid designation, and essence.
    • Bird, Alexander. 2007. Nature’s Metaphysics. Oxford: Oxford University Press.
      • Develops a dispositional essentialist account of laws of nature according to which laws are grounded in dispositions and turn out to be metaphysically necessary.
    • Carnap, R. 1936. “Testability and Meaning.” Philosophy of Science 3: 419–471 and 4: 1–40.
      • Discusses the simple conditional analysis and proposes the reduction sentences analysis of dispositionality.
    • Carnap, R. 1947. Meaning and Necessity. Chicago: University of Chicago Press.
      • Historically relevant work on the semantics of natural and formal languages which lays the foundations for modal logic.
    • Carrier, M. 2007. “Wege der Wissenschaftsphilosophie im 20. Jahrhundert.” In Wissenschaftstheorie: Ein Studienbuch, edited by A. Bartels and M. Stöckler, 15–44. Paderborn: Mentis.
      • Brief historical introduction to 20th century philosophy of science (in German).
    • Cartwright, N. 1992. “Aristotelian Natures and the Modern Experimental Method.” In Inference, Explanation, and other Frustrations, edited by J. Earman, 44–70. Berkeley: University of California Press.
      • Argues that one cannot make sense of modern experimental method unless one assumes that laws are basically about capacities/dispositions.
    • Chisholm, R. 1946. “The Contrary-to-Fact Conditional.” Mind 55: 289–307.
      • An early attempt at analyzing counterfactual conditionals.
    • Cooper, J. M., ed. 1997. Plato: Complete Works. Indianapolis: Hackett.
      • Collection of English translations of works ascribed to Plato with helpful footnotes and introductory information.
    • Dowe, P. 1992. “Wesley Salmon’s Process Theory of Causality and the Conserved Quantity Theory.” Philosophy of Science 59: 195–216.
      • Criticizes Salmon’s process theory of causality and suggests that a causal theory based on conserved physical quantities should replace it.
    • Dretske, F. 1977. “Laws of Nature.” Philosophy of Science 44: 248–268.
      • Argues that laws of nature are relations between universals.
    • Ellis, Brian. 2001. Scientific Essentialism. Cambridge: Cambridge University Press.
      • Defends the view that the fundamental laws of nature depend on the essential properties of the things on which they are said to operate and that they are metaphysically necessary.
    • Feynman, R. 1967. The Character of Physical Law. Cambridge: MIT Press.
      • A series of lectures discussing several physical laws and analysing their common features, with a focus on mathematical features.
    • Fine, K. 1994. “Essence and Modality.” Philosophical Perspectives 8: 1–16.
      • Criticizes the idea that essence is a special case of metaphysical necessity (and argues that it actually is the other way around) and discusses the relationship between essence and definition.
    • Göhner, J.F., K. Engelhard, and M. Schrenk. 2018. Special Issue: Metaphysics: New Perspectives on Analytic and Naturalised Metaphysics of Science. Journal for General Philosophy of Science 49: 159–241.
      • Addresses various aspects regarding the relationship between metaphysics and science, with a focus on the questions which metaphysical lessons we should learn from linguistics and the social sciences and whether mainstream metaphysical research programmes can have any positive impact on science.
    • Goodman, N. 1947. “The Problem of Counterfactual Conditionals.” Journal of Philosophy 44: 113–128.
      • Examines the problems that face analyses of counterfactual conditionals and attempts a partial definition of counterfactual truth.
    • Goodman, N. 1955. Fact, Fiction, and Forecast. Cambridge: Harvard University Press.
      • Introduces the “new riddle of induction” (grue-problem) and explores the concepts of counterfactual truth and lawhood in order to develop a theory of projection which resolves it.
    • Hall, N. 2004. “Two Concepts of Causation.” In Causation and Counterfactuals, edited by J. Collins, N. Hall, and L. A. Paul, 225–276. Cambridge: MIT Press.
      • Argues that there are two distinct concepts of causation, one of which is best analyzed in terms of dependence, the other in terms of production.
    • Husserl, E. 1970. The Crisis of European Sciences and Transcendental Phenomenology. Evanston: Northwestern University Press.
      • Unfinished classical text in phenomenology originally published in German in 1936, which bemoans the fact that modern science is oblivious to the life-world of humans.
    • Kistler, M. 2006. Causation and Laws of Nature. Oxford: Routledge.
      • Develops and applies a transfer theory of causation.
    • Kripke, S. 1963. “Semantical Considerations on Modal Logic.” Acta Philosophica Fennica 16: 83–94.
      • Gives an exposition of some features of a semantical theory of modal logics.
    • Kripke, S. 1980. Naming and Necessity. Oxford: Blackwell.
      • Argues that the meaning of names is not determined by descriptions and that natural kind terms rigidly designate (that is, that they designate the same natural kind across all possible worlds), thus allowing for a posteriori necessities.
    • Ladyman, J., D. Ross, D. Spurrett, and J. Collier. 2007. Every Thing Must Go: Metaphysics Naturalized. Oxford: Oxford University Press.
      • Argues for a naturalization of metaphysics by criticizing contemporary analytic metaphysics and develops a scientifically informed structuralist realist metaphysics.
    • Lange, M. 2009. Laws and Lawmakers: Science, Metaphysics, and the Laws of Nature. Oxford: Oxford University Press.
      • Instead of saying that laws support counterfactuals, Lange proposes to reverse the order and say that laws are those generalities that are stable or invariant under counterfactual perturbations.
    • Lewis, D. K. 1973a. Counterfactuals. Oxford: Blackwell.
      • An account of counterfactual conditionals in terms of modal realism. Introduces the Best Systems Account of laws of nature.
    • Lewis, D. K. 1973b. “Causation.” Journal of Philosophy 70: 556–567.
      • Proposes and modifies the counterfactual account of causation in terms of counterfactual dependence.
    • Lewis, D. K. 1986. On the Plurality of Worlds. Oxford: Blackwell.
      • Defends modal realism, which is the view that the actual world is only one of many possible worlds all of which exist, on the basis that it is highly serviceable in solving longstanding philosophical problems.
    • Mackie, J. L. 1965. “Causes and Conditions.” American Philosophical Quarterly 2: 245–264.
      • Proposes the INUS account of causation.
    • Martin, C. B. 1994. “Dispositions and Conditionals.” The Philosophical Quarterly 44: 1–8.
      • Introduces finkish dispositions as a problem for counterfactual analyses of dispositions.
    • Maudlin, T. 2007. The Metaphysics within Physics. Oxford: Oxford University Press.
      • Argues that lawhood is irreducible but can account for causation, counterfactuals, and dispositionality.
    • Mumford, S. and R. L. Anjum. 2011. Getting Causes from Powers. Oxford: Oxford University Press.
      • The authors develop not only a theory of causation based on powers, but also offer a detailed analysis of causal powers themselves.
    • Mumford, S. and M. Tugby. 2013. “What is the Metaphysics of Science?” In Metaphysics and Science, edited by S. Mumford and M. Tugby, 3–26. Oxford: Oxford University Press.
      • Introduction to a collection of state-of-the-art papers on core issues in Metaphysics of Science.
    • Paul, L. A. 2012. “Metaphysics as Modeling: The Handmaiden’s Tale.” Philosophical Studies 160: 1–29.
      • Claims that science and metaphysics of science differ with respect to their respective subject matter, but that there is no categorical difference in method, as both construct theories by building models.
    • Putnam, H. 1975. “The Meaning of ‘Meaning.’” Minnesota Studies in the Philosophy of Science 7: 131–193.
      • Argues for semantic externalism (the claim that the meaning of a term does not determine its extension, which means that the meanings of a word are not determined by the psychological state the speaker is in, but by external factors) using the Twin Earth thought experiment.
    • Quine, W. V. O. 1948. “On What There Is.” In From A Logical Point of View, 1953, 1–19. Cambridge: Harvard University Press.
      • Proposes that ontological commitments can be read off statements or scientific theories by formalizing them in predicate logic and identifying bound variables.
    • Quine, W. V. O. 1951. “Two Dogmas of Empiricism.” In From A Logical Point of View, 1953, 20–46. Cambridge: Harvard University Press.
      • The two dogmas Quine argues against are: (i) that there is a clear distinction between analytically true and synthetically true sentences, and, (ii), that each meaningful sentence faces the tribunal of sense experience on its own for its verification or falsification (rather than holistically in concert with other sentences).
    • Roberts, J. 2008. The Law-Governed Universe. Oxford: Oxford University Press.
      • Introduces the measurability account of laws of nature, which states that lawhood is a role that propositions play rather than a property of facts and that laws guarantee the reliability of methods of measuring natural quantities.
    • Salmon, W. 1984. Scientific Explanation and the Causal Structure of the World. Princeton: Princeton University Press.
      • Develops a causal/mechanical account of explanation which incorporates the idea that causation is best considered a process.
    • Salmon, W. 1994. “Causality without Counterfactuals.” Philosophy of Science 61: 297–312.
      • Agrees with Dowe’s improvement of Salmon’s 1984 theory and also proposes a transfer or conserved quantity theory of causation.
    • Scholz, Oliver R. 2018. “Induktive Metaphysik – Ein vergessenes Kapitel der Metaphysikgeschichte.” In Philosophische Sprache zwischen Tradition und Innovation, edited by D. Hommen and D. Sölch. Frankfurt am Main: Peter Lang.
      • Describes and analyses the historical programme of inductive metaphysics which developed simultaneously with Logical Empiricism.
    • Schrenk, M. 2017. Metaphysics of Science: A Systematic and Historical Introduction. London: Routledge.
      • Comprehensive, easily accessible systematic and historical introduction to Metaphysics of Science including the topics of dispositions, counterfactuals, laws of nature, causation, and dispositional essentialism, as well as information on the origins and methodology of Metaphysics of Science.
    • Schurz, G. 2016. “Patterns of Abductive Inference.” In Springer Handbook of Model-Based Science, edited by L. Magnani and T. Bertolotti, 151–174. New York: Springer.
      • Analyses the structure of abductive inferences and recommends that metaphysics should make use of such inferences.
    • Stalnaker, R. 1968. “A Theory of Conditionals.” American Philosophical Quarterly 2: 98–112.
      • Uses possible worlds semantics to analyze counterfactual conditionals without a commitment to possible worlds realism.
    • Strawson, P.F. 1959. Individuals: An Essay in Descriptive Metaphysics. New York: Routledge.
      • Distinguishes between descriptive and revisionary metaphysics and examines the relationship between our language and our habit of conceiving of the world in terms of individuals (particulars and persons).
    • Tahko, T.E. 2015. An Introduction to Metametaphysics. Cambridge: Cambridge University Press.
      • Comprehensive and easily accessible introduction to 20th century and current debates about the methodology and epistemology of metaphysics.
    • Tooley, M. 1977. “The Nature of Laws.” Canadian Journal of Philosophy 7: 667–698.
      • Argues that the relations between universals are truth-makers for laws of nature.
    • Williamson, Timothy. 2016. “Abductive Philosophy.” Philosophical Forum 47 (3–4): 263–280.
      • Recommends both ampliative inferences such as abductions (or, nearly synonymous, inferences to the best explanation) and model-building as valuable methodologies not only for the sciences but also for philosophy and metaphysics.
    • Woodward, J. 1992. “Realism about Laws.” Erkenntnis 36: 181–218.
      • Defends the view that the notion of lawfulness is linked to the notion of invariance rather than the notion of necessary connection.
    • Woodward, J. 2003. Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press.
      • Proposes an interventionist theory of causation that analyses causation by appealing to the notion of intervention or manipulation.

Author Information

Julia F. Göhner
Heinrich Heine University
Dusseldorf, Germany

and

Markus Schrenk
Email: markus.schrenk@phil.uni-duesseldorf.de
Heinrich Heine University
Dusseldorf, Germany

Language in Classical Chinese Philosophy

At first glance, early Chinese thought as expressed in Warring States period (475-221 BCE) texts does not seem to focus on the kinds of questions about language that one might expect from philosophers working on “the philosophy of language.” This does not mean that language is philosophically insignificant to early Chinese thinkers. It shows, rather, that discussions of language in these texts are part of early Chinese authors’ engagement with a larger set of philosophical problems, particularly the problem of self-cultivation. Here, “self-cultivation” means a set of generalized practices directed toward the goal of moral action, focusing on the development of a set of virtues and norms as they relate to the individual as well as progressively higher units of social organization. Although positions on self-cultivation differ widely across strands of early Chinese thought, a common goal of all competing traditions is the rehabilitation of human conduct. Discourse about appropriate “models” (fa 法) for such rehabilitation (whether they be concrete tools, exemplary individuals or abstract ideas) is found in all early Chinese philosophical texts. This, then, raises the issue of language: how does the sage (shengren 聖人), as one who has successfully mastered exercises of self-cultivation and thus furnishes us with the requisite fa, speak? Or, as some traditions ask, does the sage speak at all? Do words promote or impede an individual’s development, and is the sage’s insight an ineffable experience or is it one that can, and should, be articulated for the benefit of others? Thus, the problem of self-cultivation functions as a stage for various other intersecting inquiries into human nature, the relation between human feelings and thought or judgment, the ideal social and political organization, and the relation between the human subject and the larger processes of nature and the cosmos, among other topics.

Discussions of the linguistic dimensions of sagehood then generate other questions about language: How do words relate to psychological states? Is language a constitutive element of human nature, or is it a conventional practice that stands in a particular orientation to a naturally given state? Is language inherently tied to the incidence of social and political chaos, or is it a technology that can be used to institute order? This entry offers a brief overview of how inquiries concerning language are developed in classical Confucian, Mohist, and Daoist writings.

Table of Contents

  1. Key Terms and Problems
  2. Speech (yan 言) as Virtuous Conduct (xing 行) in the Analects
  3. Language and Self-Cultivation in the Mencius
  4. Zhengming 正名 in the Xunzi
  5. The Mohist Canons
  6. ‘Not Speaking’ in the Daodejing
  7. ‘Goblet Words’ in the Zhuangzi
  8. Additional Trends
  9. References and Further Reading

1. Key Terms and Problems

Contemporary debates in the analytic tradition on language in Chinese philosophy have been shaped to a large extent by the research of Graham (1978, 1989) and Hansen (1983) on the linguistic models displayed in the Mohist Canons. Harbsmeier (1989b, 1991), Mou (1999), Fraser (2007) and Robins (2000) represent a selection of scholars who have extended the inquiry into the grammatical and syntactical structures in the Canons by further developing some of the central theses put forward by Graham and Hansen, such as those concerning the use of word-types (like mass-nouns) and structures of predication. An enduring premise in this approach is the clear distinction between language (variously construed as speech/yan 言 and names/ming 名) and the reality (shi 實, literally, ‘objects’ or ‘solids’) with which it shares a formal, representational relationship.

Another trend in inquiries concerning language involves a less formal approach, replacing the focus on referential structures with an analysis that identifies language as part of an embodied, empirical model of experience. Geaney (2010, 2002), for instance, argues that conceptions of language in early China cannot be grasped without appreciating the larger perceptual index of sight and sound of which ‘names’ and ‘speech’ are a constitutive element. Wagner (2003) similarly underscores how conceptions of ming in early linguistic models (like that of Wang Bi) define speech in terms of aurality, with ‘names’ being understood as meaningful units of sound. Lewis (1999) calls for situating language somewhere between a purely oral, and thus aural, dimension and a written technology that serves as a more robust medium for recording and articulating judgments.

Alternate directions in the literature display a different set of concerns, foregrounding the socio-political applications of a theory of language. In this third approach, conceptions of language are often perceived as being coextensive with a conception of culture. We find, as a result, numerous schools attempting to furnish an account of how culture is to be distinguished from a natural state, and how ‘names’ or ‘speech’ fit in relative to this distinction. Multiple accounts of this distinction (as oppositional, as a continuum, or as unconnected) lead to diverse possibilities for conceiving language as a spectrum that displays a naturalist bias at one extreme and a social normative agenda at the other.

Whether we choose to capture the discussions of language in classical Chinese philosophy with a referential model that focuses on predicate logic, a perception-based model of the senses, or a more expansive understanding of language as a socio-political technology, a basic vocabulary emerges across a wide selection of texts that ties the question of language to the larger problem of how one can know the world and provide an articulate judgment of one’s experience in it. Early Chinese accounts of language are intimately bound up with how one discriminates (bian 辨) one thing from another, categorizing the world accordingly in terms of what ‘is so’ (shi 是) and what is ‘not so’ (fei 非). This dialectical capacity for division separates things on both the descriptive and the normative registers, and thus built into the ascription of something as ‘so’ is the clear sense that it ought to be so. Chris Fraser describes these dual senses of the distinction between shi and fei as follows:

They [shi and fei] apply both to the descriptive, empirical question of whether or not something is a certain kind of thing and the normative question of whether some action or practice is morally right or wrong. In effect, shi and fei refer to a very basic, general normative status that does not distinguish between the different flavors of correctness and error implicated in describing, commanding, recommending, permitting, or choosing . . . Because of their normative use, they are seen as inherently evaluative terms with action-guiding force. In ethical contexts, this feature is obvious, as shi-fei distinctions articulate values. Even in nonethical contexts, however, the attitude of deeming something shi or fei is regarded as action guiding.

A recurrent theme that we accordingly encounter in pre-Han texts concerns the relation of names (ming) to how one discriminates and orders one’s categories. What is a name (ming) in relation to that which is so (shi)? Is the negation of a thing by pointing to what it is not (fei) the opposite of a given name in that context? And how does the normative dimension of the model of bian affect the use of names along distinctions between shi and fei? As we see later in the article, these are all problems concerning language and epistemology that emerge as points of contention between the various competing schools of classical China.

2. Speech (yan 言) as Virtuous Conduct (xing 行) in the Analects

Concerns with language in Confucius’ Analects come to rest squarely within the text’s overarching composition of a program of self-cultivation. Names (ming 名) and the activity of speaking (yan 言), broadly construed in both a nominal and verbal sense, therefore do not present the reader with the kind of problematic that requires establishing a logical relation between mental content (as determining the ‘meaning’ of a word) and the world as a given, objective correlate. Rather, the salient question the text repeatedly poses is how to use words and speak in general such that one’s linguistic comportment can coincide with one’s character as a virtuous person. A direct consequence of aligning the question of language along these lines is to be seen in frequent discussions in the Analects where both the style of a person’s speech (its elocutionary attributes, such as tempo and diction) and its content emerge as useful measures of moral development. The Master is thus concerned with whether one’s words are sincere (xin 信) and unequivocally identifies “clever or cunning speech” (qiao yan 巧言) (Analects, 1.3) with the absence of ren or virtue, as it is broadly construed in this text.

There exists in the Analects, then, no sense of the inherent value of words as signifiers of an external reality. Rather, language is analyzed as a philosophical problem only in relation to the viability of a virtue-based ethics, and its efficacy is to be judged in its successful subordination to, and implementation of, a model of virtuous conduct (xing 行) (see Analects, 9.24). At one end of this spectrum, the Master invokes the rhetorically powerful example of the Ancients, who remain silent out of fear that their actions will not match their words (see 4.22). But of more use is the model of the ‘nobleperson’ or junzi 君子, who displays a flawless calibration of words to action. Scattered across the text, the majority of discussions regarding the nature and use of words comes to settle on the need to emulate the linguistic perspicuity exhibited by this ideal type. The junzi speak with sincerity (xin 信) (see 1.7) and their use of language is repeatedly described as careful (see 1.14), slow (12.3) and always bound by the larger concerns with virtuous conduct (2.13, 4.24).

The capacity to undermine the Confucian art of self-cultivation through a gross misuse of language emerges as a necessary corollary to the conceptual bind the text forges between one’s speech (yan 言) and conduct (xing 行). While all who are virtuous speak in accordance with their character, it is not the case that all who speak are necessarily virtuous (see 14.4). Language can then serve equally as a marker of both moral health and moral decrepitude. It is this basic observation that underlies a central Confucian conviction that the health of a society, and its apex political institutions, can be achieved through the practice of zhengming 正名, or ‘correcting names.’ While this is an overt concern and stated objective in the Xunzi, the Analects underscores the important role that zhengming plays in a famous passage that links socio-political disorder ultimately with a state of linguistic disorder (see 13.3). If names (ming 名) in their specific designation refer not to discrete objective correlates (‘son,’ ‘father’ as neutral, discrete units) but rather to how one must act in relation to the roles associated with such names (to ‘be a son,’ to ‘be a father’), then a state of linguistic disorder is one in which the designation of behavioral norms implied in the use of names no longer works or implies a failure of these norms. Where the performative designations of our names are not properly understood, socio-political chaos must necessarily reign. The Analects thus points in the direction of a prescriptive theory of language in its brief formulation of a program of zhengming, which involves the rehabilitation of such a compromised language and its social and political ill-effects.

3. Language and Self-Cultivation in the Mencius

In the Mencius, the Confucian program of self-cultivation is given further conceptual depth to the extent that a more robust metaphysics of human nature (ren xing 人性) anchors the entire project. The text organizes its discussions of language with particular attention to its overriding concerns with the nature and development of the heart (xin 心) and the attainment of a kind of moral animation in the human subject, which it describes in Mencius 2A2 as having a “flood-like qi” (hao ran zhi qi 浩然之氣). In other words, the imperative in the Mencius is not simply to secure a complementary organization of language (yan 言) and virtuous conduct (xing 行), as we have seen in the Analects. The text adds depth to this generalized formulation of language by integrating the question of how to use words with its more intricate moral psychology of the heart and human nature. One appreciates the implications of this move in the naturalized status that extends to language itself. For instance, Mencius 4A15 establishes a parity between certain basic physical attributes, like the pupils of a person’s eyes, and the kind of language they speak. Crucially, these attributes (one anatomical, another linguistic) function as potent markers of a more fundamental moral signature of human nature. Thus, if the inherently moral capacities of being human are to be realized, the text points to both one’s pupils and one’s words as the natural markers of moral development.

The position the Mencius takes on the status and role of language is, however, not so straightforward if we consider two basic paradigms in the text that bring everything into moral orientation. The first of these models is that of the ‘nobleperson’ or junzi 君子, who is able to grow the “four (moral) sprouts” (si duan 四端) of the heart and successfully master the virtuous conducts of benevolence (ren 仁), ritual propriety (li 禮), righteousness (yi 義) and knowledge (zhi 知). Such a perfected moral state, while it manifests in the junzi’s physical comportment, remains wordless (bu yan 不言, Mencius 7A21). At the cosmological level, the text is emphatic about the silence of Heaven (tian 天), whose commandments, which remain unarticulated, can be gleaned only from the evidence of the King’s conduct and the people’s acceptance (see Mencius 5A5).

However, it is between the poles of silence and grandiose speech that the Mencius affirms the efficacy and value of language. While it describes the junzi as effecting a wordless practice, the text simultaneously upholds speech that is simple and concise (compare Mencius 7B32, 4B15). The overarching framework of ren xing, furthermore, supplies the authors with a standard for truth or genuineness such that speech that complements the natural development of virtuous conduct is positively upheld as corresponding with the reality (shi 實) of things (Mencius 4B17). A corollary to a genuine/natural language is the potentially false modality of speech, and the Mencius explicitly participates in this arbitration between truth and falsity by rejecting what it terms “one-sided” and “perverse” speech (see Mencius 2A2, 3B9). Here we are presented with an important dimension to the linguistic philosophy of the Mencius in its thematization of the activity of disputation, or bian 辯, a dialectical framework of language characterized by the eristic exchanges between various parties to a debate. Words in this context admit either to being true or false, and the text explicitly stakes its claims by rendering the principles of competing schools, like those of Yang Zhu and Mo Di, as “one-sided” and “perverse.” Yet, measures of truth and falsity in the Mencius, it bears repeating, do not function in relation to an objective, neutral external world. Rather, the performative dimension of self-cultivation remains the basic conceptual frame. To speak truly and genuinely, in a way that corresponds to the reality of things, implies that such words are distinguished primarily by their virtuous quality. The perversity of the speech of adversaries, like Yang Zhu and Mo Di, is a problem precisely because of the potential of such misguided language to draw society down into a bestial condition, where the genuine principles of benevolence and righteousness are nowhere to be seen (Mencius 3B9).

4. Zhengming 正名 in the Xunzi

Xunzi’s philosophy revolves around the central premise that one’s humanity can be successfully shaped only through concerted effort within the institutional frameworks of education and ritual. A conceptual locus in the text is accordingly represented by the concept of wei 偽, ‘deliberate effort,’ a model of virtuous conduct that involves the concerted implementation of institutionally mandated practices. Xunzi’s often cited constructivism is thus to be distinguished from the Mencian belief in the continuity between nature (xing 性) and institutions, the latter being mechanisms by means of which natural dispositions, as positive traits already present in an individual, can be fully actualized. Nature and nurture for Xunzi are not complementary as they are for Mencius, and the former’s claim that “human nature is evil” (xing e 性惡) implies that the work of nurture is a focused undoing or rectification of a naturally undesirable configuration of elements in an individual. The notion of wei 偽 therefore implies a concerted level of intervention in natural processes and patterns, denoting an activity that is distinguished by its levels of artifice rather than spontaneity.

Xunzi’s concern with establishing right order, then, does not extend to achieving a harmonious state prescribed in nature, but instead refers to appropriately functioning conventions of society and politics. It is within this overall context of assumptions regarding nature and the institutions that are necessary for a society’s ordered existence that the question of language proves to be of pivotal importance in the text. Names (ming 名) in the Xunzi are a technology through which the undesirable traits of human nature can be both expressed and curtailed. As the text states, names have no “innate appropriateness” (gu yi 固宜), nor do they admit of any “intrinsic reality” (gu shi 固實). Yet, there are those which are “intrinsically good” ([ming you] gu shan 名有固善). Xunzi thus frees language from any problematic tie with nature since words share no constitutive bond with xing 性, a state that, in turn, is described as “evil,” e 惡. At the same time, however, they are potential markers of virtuous conduct, and it is successfully utilizing this potential of language to rehabilitate society that constitutes a central aim of the text.

The chapter entitled Zheng Ming 正名, “Correcting Names,” details the Xunzi’s intricate treatment of language in both its calamitous and its remedial versions. The text begins by attributing a significant source of disorder in society to a particular linguistic condition, which it associates with a series of flawed acts like “splitting names,” “making up new names,” and “throwing into disorder established names.” What comes in for censure here is, in essence, the relativism of standards provoked by the competing theories of the Mohists and other camps like the School of Names (Ming jia 名家). The text diagnoses as deplorable a situation in which each school articulates a ‘name’ for itself, evaluating and discriminating reality on the basis of a set of purely subjective observations. One’s ability to understand and negotiate reality (shi 實), according to the Xunzi, depends on the quality of one’s names or ming 名 (broadly construed to include categories and distinctions) made in language. Where numerous distinctions crowd around the same reality (be it an object, a relation, a character, a role, and so forth), the designation between ming 名 and shi 實 breaks down, resulting in chaos and confusion.

How, then, does one go about “correcting names”? The text upholds its Confucian commitment to tradition, adapting its conservatism, however, to the specific task of rehabilitating the linguistic standards perfected and fixed by the previous generations of kings. These are the “common names” (san ming 散名), which exhibit a clarity of designation between ‘names’ and ‘reality’ that must be modeled if the disorder that prevails in society is to be corrected. The Xunzi elaborates a nuanced framework to explain this positive linguistic model, explaining the origin of ‘correct’ names in relation to other aspects of an individual’s physical, psychological and epistemic experience, and, in this respect, arguably makes its most significant contribution regarding questions of language. What the sage, like the true kings of the past, is able to successfully identify is the evolution of a given experience through its various stages of development: starting with the elemental origins in the senses; the psychological shaping of such sensory stimuli in feelings/dispositions or qing 情; and the overall understanding or knowledge (zhi 知) of the heart that is able to make sense of and correctly judge the entire process as it unfolds. Sages display a mastery over this entire psycho-physical complex, and their acute zhi 知 enables them to identify which things involve a similar sensory experience and evoke corresponding, similar dispositions, and which things must be accordingly distinguished as generating divergent stimuli and responses. This perspicacity leads to the correct designations in language, where each set of names exhibits a careful sorting of accumulated sensory and psychological data with the constant inflow of new experiences. It is this sorting activity at the level of names that constitutes, in the most rudimentary sense, the deliberate effort (wei 偽) that the Xunzi praises in the work of sages and the larger institutional frameworks of education and ritual. 

The implementation of zheng ming obviates the proliferation of multiple standards and classes of things by which people can judge their reality. To ‘correct names,’ then, is, first and foremost, to safeguard a society from the scourge of relativism. The text accordingly recommends that the king regulate definitions of names so that his citizens clearly understand the meanings and referents of words that are in use. Ming 名 and shi 實 are thereby harmonized, such that the relations between words and their referents are made plainly manifest and are agreed upon in the social and political conventions through which language is put to use. Zhengming is thus primarily about the social and political benefits to be gained from using language in a particular mode. As the text affirms in its advice to kings, correcting names equips the people with a unified intention and enables them, ultimately, to follow the law. This is the only path to good and successful governance.

5. The Mohist Canons

The short tracts of text that comprise the Mohist Canons as well as the longer work of the Mozi offer a series of dense statements on the nature of language. The Canons in particular put forward a theoretical framework that establishes standards for making true statements and engaging in clear and effective communication. As scholars have often suggested, the Canons are remarkable for the technical nature of their discussions on names (ming), on the relation between names and the reality of objects (shi), and on the epistemic status of our language. Yet, there is an unmistakable sense that the text remains bound to the narrow objective of establishing a sound theory of language for the purposes of defining the basic tenets of Mohist doctrine. A general frame for these inquiries into the nature and the proper use of names is therefore the model of ‘debate’ or bian, which is explicitly thematized in the Canons as the guiding activity through which the proper dao (as envisioned by the Mohists) can be codified and defended. The text defines bian as “contending over claims which are the converse of each other” and continues to state that “winning in disputation is fitting the fact.” Claims which are, in a bian-type exchange, the “converse” of each other are, as we have already seen, the dichotomy of claiming one thing to be so (shi) and another to be not-so (fei). The Mohist is emphatic on the factual nature of this distinction, explicitly marking out the categories of shi and fei as either fitting with reality or not, and the Canons equip the practitioner with the requisite tools and knowledge with which to master this art of discrimination and to articulate the true and correct picture of Mohist doctrine.

We should thus read the Canons as, first and foremost, a text that expounds a dialectical model equipping a speaker to clearly distinguish what is so or right (shi 是) from what is not so or wrong (fei 非). As a manual of argumentation or debate (bian), it accordingly inquires into the fundamental laws governing names (ming 名) and their referencing of objects/reality (shi 實), and discusses more complex problems surrounding the nature of evidence in arguments, the relation between sentences and a speaker’s thoughts, the uses of analogy, and the methods of illustrating, matching, adducing, and inferring (to name but a few of the themes covered).

At the heart of the diverse discussions on language in the Canons lies what Angus Graham has called a “radically nominalist approach to naming.” Such a model does not admit a premise of essences at work in language, whereby a name for a thing might be understood as referencing a core, defining idea that transcends all particular instantiations. To categorize something as ‘this’ or as ‘so’ (shi), and to extend that category to a ming or ‘name,’ is to simply pick out one thing among others and identify it as what it is called. “[T]here is no ‘essence’,” as Graham suggests, “merely the existence (you 有) of the thing with all its properties.”

The nominalism of the Canons does not, however, commit the Mohist to a relativistic view on truth or to a skepticism regarding the epistemic status of names. A central objective of the text in this respect is the identification of the correct procedures for relating names to objects so that language can be used consistently and correctly. The Canons thus articulate a larger epistemological framework by presenting specific sources of knowledge and identifying specific objects of knowledge that allow for a more structured and nuanced discussion of how names are engendered and the various orders of meaning they convey. Knowledge (zhi 知) can be obtained “by hearsay [report], by explanation, and by personal experience [observation]” and its specific objects are “names (ming), objects (shi), how to relate [an object to a name], and how to act.” We find here a basic set of premises shared by the Confucians—namely, that distinguishing between objects using names, and being able to successfully apply the correct names (that is, relate names to objects) produces knowledge and has the effect of guiding one’s actions. Yet, while the Confucian paradigm, as we have seen it on display in Analects 13.3 and in the Xunzi, sets about rectifying the reality of behavior and conduct so as to rehabilitate the correct norms codified in language, the Mohist Canons are emphatic on the need to grasp the act of naming itself. The name (ming), in other words, functions as a definition of the thing (shi), and in doing so denotes its reality.

At the heart of the Canons, then, lies a basic set of premises regarding how to discriminate between the names for various things based on more subtle distinctions between the various kinds or classes of names and referents. Thus, for example, Canon A78 identifies three classes of ming that align with the kinds of referents they point to:

Names. Unrestricted; Classifying; Private.

‘Thing’ (wu 物) is ‘unrestricted’; any object necessarily requires this name. Naming something ‘horse’ is ‘classifying’; for ‘like the object’ we necessarily use this name. Naming someone ‘Jack’ is ‘private’; this name stays confined in this object.

Unrestricted (da 達) names, covering the largest class or kind, have a general scope of designation (like the name, thing/wu 物); then there are class (lei 類) names, which refer to particular kinds/classes of things and are thus limited in scope (like horse/ma 馬); finally, there are personal or private (si 私) names, which are singular in reference (like a proper noun, Jack). That this typology functions on the basis of an underlying ontology of sameness and difference is evident in the logic which drives us from using one type of name to another. Between the word ‘thing’ and ‘horse,’ we have separated out members and distinguished one kind of ‘thing’ from others with which it does not share defining traits. A horse is not a hammer, and thus can be distinguished by a name that marks both its difference from other things (hammers) and its sameness with others (other horses). The Canons appear to take for granted the idea that the reality of objects (shi 實) is divided along such natural classes of sameness and difference, and names, as definitions of this reality, correspond to and express these divisions of classes as given facts that are observable in one’s experience.

The act of speaking (yan 言), then, is a dynamic composite of naming, where a directed intention on the part of the speaker to convey some idea or thought (yi 意) leads to an explicit choice of naming in relation to reality. This act of referring (ju 舉) is an integral moment of the speech act, which the Canons define as “picking out an object from among others by means of its name.” To refer, furthermore, “is to present the analogue for the object” and every reference therefore is an act of setting up an “archetype” (ni 擬) which the chosen name evokes as a meaningful standard (fa 法). Speaking (yan 言) is described as an “emergence of references” (chu ju 出舉), a linking up of various names that evoke models or archetypes that all speakers are in possession of. Thus, in addition to the premise that there are different kinds of names (based upon sameness and difference, for example), the Canons also appear to assume the role that convention plays through mutually agreed upon standards or archetypical referents for the names shared among a linguistic community.

6. ‘Not Speaking’ in the Daodejing

The canonical texts of early Daoism also question the role and status of language in relation to an ideal of self-cultivation that is set up as a prime objective to be achieved. However, in sharp contrast to the constructivist tendencies of Confucian discourses, texts like the Daodejing and Zhuangzi explicitly reject the idea that language can be optimally regulated in and through institutional frameworks and conventional practices. There is, moreover, a thoroughgoing suspicion that pervades these texts regarding the value of language in general, and we repeatedly encounter the claim that linguistic expression, in its very constitution, is ridden with epistemic poverty (insofar as words do not attain any true standards for knowledge). This leads to a more extreme position, often cited by scholars in both the Daodejing and Zhuangzi, that rejects language, as such, as a medium of expression. Harmonization with dao, the focus of self-cultivation, is thus understood to be a distinctly extra-linguistic experience.

The Daodejing makes its case for the ineffable quality of a practice of self-cultivation by describing the sage repeatedly as one who does not speak. Daodejing 56 emphasizes in this regard the inversely proportional relation between knowledge and speaking, where “one who understands [dao] does not speak” and one who has no understanding whatsoever has much to say (zhi zhe bu yan, yan zhe bu zhi 知者不言,言者不知). As a categorical rebuke of the Confucian faith in institutional practice and of the conceptual locus established by the notion of deliberate effort (wei 偽) in texts like the Xunzi, the Daodejing extols the model of “non (or non-coerced) action” (wu wei 無為). Sages, in other words, must abandon the strictures that come down by way of conventional standards, habits, cultures of education, and other institutionalized patterns of behavior and conduct. Acting without acting, then, is to divest oneself of the social mores that, in a Confucian practice, are pivotal to the successful implementation of a program of self-cultivation. The text appears to suggest that such sagacity entails a termination of speech, as we learn in Daodejing 2, which describes how sages who excel in the affairs of non-action “practice the teaching that is without words” (xing bu yan zhi jiao 行不言之教).

And yet, the irony, if not the outright contradiction, of an argument that claims the inadequacy of language that is itself put in words is not lost on the authors of the Daodejing. To use language to extol a condition that appears, on the face of it, to be extra-linguistic therefore suggests a more nuanced perspective that these authors hold. We find, for instance, an additional set of claims in the text that uphold a certain kind of speech, and which positively describe words of the sage that mirror the spontaneous patterns of the dao. The ontology captured by the character ziran 自然, the ‘self-so-ing’ essence of dao that manifests in diverse cycles of change and natural progression, finds expression in a particular modality of speech in which words match the fluidity of nature. Rather than a state of complete and total aphasia (the speechlessness that, for example, defines the Pyrrhonian skeptic), the art of wuwei involves a perspicuous and measured operation of language. The Daodejing does in fact describe positive linguistic traits to be modeled, like words that are “trustworthy” (信, Daodejing 8) and that are “lacking in that which can be blamed” ([善言]無瑕讁, Daodejing 27). The text even identifies certain standards by which the reliability of speech can be judged, stating in Daodejing 81, for instance, that “trustworthy words are not beautiful” (信言不美). The sage who acts without acting, then, also speaks without speaking. As a linguistic complement to its model of wuwei, the Daodejing, rather than eliding language completely from its agenda, recommends a certain modulation of speech whereby the errors in how we utilize language might be removed and its potential to express the patterns of dao might be affirmed.

7. ‘Goblet Words’ in the Zhuangzi

While it retains the core themes of the Daodejing, the Zhuangzi elevates its criticism of Confucian and Mohist discourse and dismantles, in a spectacular fashion, the fundamental structures of dialectical speech that underlie both philosophical positions. The authors of the Inner Chapters (Neipian 內篇) build, in this respect, an elaborate critique of argumentation [or disputation] (bian 辯), a genre of thinking and speaking that is defined by eristic speech, which, as we have seen, pivots on the choice of arguing for one alternative over another. The Qiwulun, the second of the Inner Chapters, evaluates the tenability of the basic kind of dialectical exchange that it associates with the debates of the Confucians and Mohists, where each party argues for its set of claims as true and as constituting a body of knowledge, and correspondingly associates the opposing party’s claims with falsity. The linguistic structure underpinning all such eristic speech is represented by the clear distinction between a positive ascription of what is the case (conveyed by the character shi 是) and a negative attribution using the character fei 非 to reference all that is not. In sharp disagreement with the linguistic models of texts like the Mozi and the Mohist Canons, the Zhuangzi associates this dichotomy of shi-fei claims (of what is and is not so, of what is right and wrong) with a vocabulary of artifice and inflexibility.

夫道未始有封,言未始有常,為是而有畛也。

The way has never had borders; speech has never had any regularity. Make claims about what is so, or what is right, and there are boundaries.

The method of defining what is so, as we read here, consists literally in the making of a definition (conveyed by the characters wei shi 為是), where the artifice of a fixed category stands in direct contrast to the processual nature of experience that is dao. Furthermore, dividing language in terms of strict labels, standards or categories continually eludes the reality of dao and only serves to delude an individual with false standards for knowledge. Bian 辯, owing to the very nature of sophistical speech, therefore endlessly carries on and, as per the diagnosis of the Zhuangzi, serves only to wear out the heart-mind (xin 心).

Yet, in analogous fashion to the Daodejing, the Zhuangzi does not recommend an indiscriminate abandoning of all speech. The exemplary model of the sage not only speaks, but does so in a language that, in fact, occasionally spills into the genre of dialectics.

物無非彼,物無非是。自彼則不見,自知則知之。故曰:彼出於是,是亦因彼。

Of things, there are none that are not ‘that’ (bi 彼); of things there are none that are not ‘this’ (shi 是); One cannot see a thing if one approaches it as ‘that,’ one knows it as ‘this’ only as it is known to oneself. Thus it is said: ‘That’ emerges from ‘this,’ ‘this’ follows from ‘that.’

. . . 為是不用而寓諸庸…因是已。已而不知其然,謂之道。

[The sage] does not use a [fixed] definition of what is the case (wei shi 為是) but instead lodges it in the usual . . . This is to judge what is so on a given basis (yin shi 因是) and stop. Stopping without knowing (bu zhi 不知) it to be so, this is called dao.

Unlike the rhetorical ploys and logic-chopping inherent to the activity of bian 辯, the generation of categories in the sage’s dialectic is fluid and perpetually under revision. A key insight in the Zhuangzi thus relates to the inescapability of linguistic expression and the corresponding need to constantly modulate our categories so they can adapt to shifting perspectives and contexts.

The text articulates this positively appraised framework of language using the metaphor of “goblet words” (zhiyan 卮言), a class of speech that is set apart from the ordinary use of language. While the latter functions through a stable matrix of ascriptions and designations between words and reality, the image of the goblet serves the purpose here of emphasizing a thorough dynamism in the way that words can be deployed. Like a goblet that continually overflows only to be filled again with water, the Zhuangzi conceives of a transformative speech that similarly ‘overflows’ each act of categorization or definition. Language, in such a figuration, enables a speaker to express multiple possibilities of experience, and it takes on a varied and rich descriptive quality that, as the text states, “harmonizes with the natural” (he yi tian ni 和以天倪). In sharp contrast to the Confucian agenda of zhengming, which strives toward instituting a catalog of names deemed to be singular and fixed in their denotations, the goblet language of the Zhuangzi is forever under revision, accumulating ever more shades and textures to our names so they may correspond to the self-so-ing (ziran 自然) ontology of dao.

8. Additional Trends

There are of course additional texts and trends, both in pre-Han Chinese literature and in later literary traditions, that further illuminate the line of inquiry that has been introduced here. One body of work that offers ample opportunity for further research is the corpus of excavated materials that has yet to receive an in-depth treatment focusing on the themes and problems of language. Two texts, the Tai Yi Sheng Shui 《太一生水》 and Heng Xian 《恆先》, for example, identify a set of positions on names (ming) as part of larger cosmogonic models. In the case of the Tai Yi Sheng Shui text, the problem of naming is specifically related to a cosmogonic account in which an underlying structure of binary pairings governs the nature and use of names. The text articulates the question of language, in other words, in relation to an account of genesis, and the potential of names (ming 名) is rendered in their ability to either maintain or upset a generative structure that is understood to subtend all things. This imbrication of cosmogony and language, moreover, points explicitly to the role of cultivation that we have identified as deeply connected to the question of language in classical Chinese accounts. The regenerative logic of the cosmogonic account, when it is replicated at the level of language, endows the speaker with the ability to bring harmony to the realm of human endeavors and to aid in the cultivation of one’s person. The Tai Yi Sheng Shui resorts to the familiar model of sages, and presents them as figures who utilize cosmogonic principles of regeneration and rebirth by appropriately wielding the ‘name’ of dao. In doing so, the text explicitly praises them for achieving the completion of affairs (shi 事) and the cultivation of their persons (shen 身).

The Heng Xian seems to offer an alternative account in which the organizing conceptual frame is the ontological division between being or presence (you 有) and non-being or absence (wu 無). ‘Names,’ in this binary account, are endowed with a mediating role between a conscious, coercive activity and a complete absence of the same. The text articulates this middle ground through the creative notion of names and accompanying “endeavors” (shi 事) that “become (or happen) of themselves” (zi wei 自為).

This article has offered but one perspective on the treatment of language in classical Chinese texts, foregrounding the intersection of concepts of language and the larger concern with cultivation practices. Numerous possibilities for thinking about the nature of language emerge along a spectrum where speech is rendered, at one end, as a natural disposition, or, at the other, as an artificial construct that must be calibrated to achieve a desired state at the individual as well as communal levels. Irrespective of a bias toward naturalism or constructivism, a recurring theme emerges in the figure of the sage or shengren who supplies each of the schools with a model or fa 法 for how language should ideally be deployed. The excavated literature adds further diversity to this conversation, offering another iteration of the sage who appears to borrow from both the Confucian and the Daoist theories of language and their corresponding models of sagacity.

9. References and Further Reading

  • Allan, Sarah. 2003. “The Great One, Water, and the Laozi: New Light from Guodian.” T’oung Pao 89 (4/5):237–285.
  • Boltz, William. 1985. “Desultory Notes on Language and Semantics in Ancient China.” Journal of the American Oriental Society 105 (2):309–313.
  • Brindley, Erica F. 2013. “The Cosmos as Creative Mind: Spontaneous Arising, Generating, and Creating in the Heng Xian.” Dao 12 (2):189–206.
  • Fraser, Chris. 2007. “Language and ontology in early Chinese thought.” Philosophy East and West 57 (4):420–456.
  • Fraser, Chris. 2016. The Philosophy of the Mòzĭ: The First Consequentialists: Columbia University Press.
  • Geaney, Jane. 2002. On the Epistemology of the Senses in Early Chinese Thought: University of Hawaii Press.
  • Geaney, Jane. 2010. “Grounding ‘language’ in the senses: What the eyes and ears reveal about Ming 名 (names) in early Chinese texts.” Philosophy East and West 60 (2):251–293.
  • Graham, Angus C. 1978. Later Mohist Logic, Ethics, and Science: Chinese University Press.
  • Graham, Angus C. 1989. Disputers of the Tao: Philosophical argument in ancient China: Open Court La Salle, Ill.
  • Hall, David L., and Roger T. Ames. 1987. Thinking Through Confucius: State University of New York Press.
  • Hansen, Chad. 1983. Language and Logic in Ancient China: University of Michigan Press.
  • Harbsmeier, Christoph. 1989a. “The Classical Chinese Modal Particle Yi.” In Proceedings of the Second International Conference on Sinology, 471–503. Academia Sinica.
  • Harbsmeier, Christoph. 1989b. “Marginalia Sino-Logica.” In Understanding the Chinese Mind: The Philosophical Roots, edited by Robert E. Allinson, 59–83. Oxford.
  • Harbsmeier, Christoph. 1991. “The mass noun hypothesis and the part-whole analysis of the White Horse Dialogue.” In Chinese Texts and Philosophical Contexts: Essays Dedicated to Angus C. Graham, 49–66. Open Court.
  • Hutton, E.L. 2014. Xunzi: The Complete Text: Princeton University Press.
  • Kjellberg, Paul. 2007. “Dao and Skepticism.” Dao 6 (3):281–299.
  • Lewis, Mark E. 1999. Writing and Authority in Early China: State University of New York Press.
  • Loy, Hui-chieh. 2003. “Analects 13.3 and the Doctrine of ‘Correcting Names’.” Monumenta Serica 51:19–36.
  • Mou, Bo. 1999. “The structure of the Chinese language and ontological insights: a collective-noun hypothesis.” Philosophy East and West:45–62.
  • Perkins, F. 2014. Heaven and Earth Are Not Humane: The Problem of Evil in Classical Chinese Philosophy: Indiana University Press.
  • Robins, Dan. 2000. “Mass nouns and count nouns in classical Chinese.” Early China 25:147–184.
  • Wagner, R. G. 2003. Language, Ontology, and Political Philosophy in China: Wang Bi’s Scholarly Exploration of the Dark (Xuanxue): State University of New York Press.
  • Yearley, Lee H. 2005. “Daoist Presentation and Persuasion: Wandering among Zhuangzi’s Kinds of Language.” Journal of Religious Ethics 33 (3):503–535.
  • Zhuangzi. 1956. Zhuangzi Yinde (A Concordance to Chuang Tzu), Harvard-Yenching Institute Sinological Index Series. Cambridge MA: Harvard University Press.

Author Information

Rohan Sikri
Email: rsikri@uga.edu
University of Georgia
U. S. A.

Plato: Meno

Plato’s Meno introduces aspects of Socratic ethics and Platonic epistemology in a fictional dialogue that is set among important political events and cultural concerns in the last years of Socrates’ life. It begins as an abrupt, prepackaged debater’s challenge from Meno about whether virtue can be taught, and quickly becomes an open and inconclusive search for the essence of this elusive “virtue,” or human goodness in general. This inquiry exhibits typical features of the Socratic method of elenchus, or refutation by cross-examination, and it employs typical criteria for the notoriously difficult goal of Socratic definitions. But then a distinctive objection to the possibility of learning anything at all by such inquiry prompts the introduction of characteristically Platonic themes of immortality, mathematics, and a “recollection” of knowledge not learned by experience in this life. A model geometry lesson with an uneducated slave is supposed to illustrate the importance of being aware of our own ignorance, the nature of proper education, the difference between knowledge and true belief, and the possibility of learning things without being taught. When the conversation returns to Meno’s initial question of whether virtue can be taught, Socrates introduces another manner of investigation, a method of “hypotheses,” by which he argues that virtue must be some kind of knowledge, and so it must be something that is taught. But then Socrates also argues to the contrary that since virtue is never actually taught, it seems not to be knowledge after all.

This dialogue portrays aspects of Socratic ignorance and Socratic irony while it enacts his twofold mission of exposing common arrogant pretensions and pursuing a philosophical knowledge of virtue that no one ever seems to have. It is pervaded with typical Socratic and Platonic criticisms of how, in spite of people’s constant talk of virtue, they value things like wealth and power more than wisdom and justice. And it includes a tense confrontation with one of the men who will bring Socrates to trial on charges of corrupting young minds with dangerous teachings about morality and religion. The dialogue closes with the surprising suggestion that virtue as practiced in our world both depends on true belief rather than knowledge and is received as some kind of divine gift.

Table of Contents

  1. Overview of the Dialogue
    1. Dramatic Setting
    2. Characters
      1. Socrates
      2. Meno
      3. Anytus
    3. Summary of Arguments, in Three Main Stages
  2. Major Themes of the Dialogue
    1. Virtue and Knowledge
    2. Recollection and Innate Ideas
    3. Teaching and Learning
    4. Theory and Practice
  3. Relations of the Meno to Other Platonic Dialogues
  4. References and Further Reading
    1. The Standard Greek Text
    2. Some English Translations
    3. Some Book-Length Studies
    4. Some Articles and Essays on the Major Themes
      1. Virtue and Knowledge
      2. Recollection and Innate Ideas
      3. Teaching and Learning
      4. Theory and Practice

1. Overview of the Dialogue

a. Dramatic Setting

The Meno is a philosophical fiction, based on real people who took part in important historical events. Plato wrote it probably about 385 B.C.E., and placed it dramatically in 402 B.C.E. Socrates was then about sixty-seven years old, and had long been famous for his difficult questions about virtue and knowledge. In just a few years, he would be convicted and executed for the crime of corrupting the youth of Athens. This dialogue probably takes place in one of Athens’ gymnasia, where men and boys of leisure gathered not just for exercise, but also for education and socializing. Socrates often conducted his distinctive philosophical conversations in places like that, and ambitious young men like Meno, who studied public speaking and the hot intellectual topics of the times, wanted to hear what Socrates had to say. Some wanted to try refuting him in public.

The larger setting is the political and social crisis at the end of the long Peloponnesian War. After finally being defeated by Sparta, Athens has narrowly escaped total destruction, and is now ruled by a Spartan-backed oligarchy. The questions in the Meno about teaching virtue are directly related to longstanding tensions between oligarchic and democratic factions. For generations, Athens had been an intellectual, economic, and military leader, especially after her crucial role, together with Sparta, in repelling the Persian invasions of Greece in 490 B.C.E. and 480 B.C.E. Athens’ radically democratic form of government was distinctive but influential in typically oligarchic Greece, and influential largely because she presided over the Delian League of nearly 200 city-states, which became an Athenian empire. After those Persian invasions, many independent cities had asked Athens to replace Sparta in leading a united defense and reprisal against the Persian empire. But eventually most were just supplying mandated funds to Athens, basically for the continuation of Athens’ war against Sparta’s Peloponnesian League. Through many reversals of fortune, Athens both suffered greatly and flourished culturally, using some of that tribute for her own development and adornment. Much of the best Greek art still familiar to us today (the sculpture and architecture, the tragedy and comedy) comes from the Athens of that time. Artists and intellectuals flocked to Athens, including the new kind of traveling teachers, called “sophists,” who are so disparaged in the last part of the Meno. These teachers were independent entrepreneurs, competing with each other and providing an early form of higher education. Much of their influence came through their expensive courses in public speaking, which in Athens prepared young men of old aristocratic families for success in democratic politics. But various sophists also taught various other subjects, from mathematics to anthropology to literary criticism.

Shortly before this dialogue takes place, some leading Spartans and allies considered killing all the Athenian men and enslaving the women and children. But they decided instead to support a takeover by a brutal, narrow oligarchy, led by thirty members of aristocratic Athenian families who were unhappy with the democracy. Their executions, expropriations, and expulsions earned them the hatred of most Athenians; later “the Thirty” became known as “the Thirty Tyrants.” The extremists among them first purged their more obvious enemies, then turned to the moderates who resisted their cruelty and wanted a broader oligarchy or restricted democracy that included the thousands in the middle class. Thousands of Athenians were killed or fled the city, and many who stayed acquiesced in fear for their lives. But supporters of a return to democracy soon rallied outside the city, defeating the Thirty’s army in May 403 B.C.E. The conversation in the Meno takes place in late January or early February 402 B.C.E. (after Anytus’ return from exile in 403 B.C.E., before Meno’s departure for Persia by early 401 B.C.E., and shortly before annual rites of initiation to the religious Mysteries, which are mentioned at Meno 76e). Democratic and oligarchic factions might then still have been negotiating terms of reconciliation in order to prevent further civil war. The resulting agreement included a general amnesty for crimes committed up to that time, excluding only the Thirty and a few other officials. But the last of the extreme oligarchs would soon massacre the nearby town of Eleusis and take power there, and then attempt another takeover at Athens in 401 B.C.E., before they are finally put down for good.

As Meno and Socrates discuss the nature of virtue and how it might be acquired, the Athenian success story is not over. The democracy would continue for most of the next century, and even a semblance of the empire would be revived. But for now, the recently restored democracy is anxious about continuing class conflict, and fearful of renewed civil war. Some democrats were suspicious of Socrates, and may have believed that he had sided with the extreme oligarchs, because of his prior relationships with some of them. The general amnesty did not allow prosecuting such allegations. But after the war, Socrates continued his uniquely nondemocratic yet anti-elitist, unconventional yet anti-sophistic interrogations. Many Athenians thought that he was undermining traditional morality and piety, and thereby corrupting the young minds of a vulnerable community. Those were the formal charges that led to Socrates’ execution in 399 B.C.E.

b. Characters

i. Socrates

About the historical Socrates, much of what we think we know is drawn from what Plato wrote about him. Socrates published nothing himself, but, probably soon after his death, the Socratic dialogue was born as a new genre of literature. He was portrayed with different emphases by different authors, including Xenophon, Aeschines, Antisthenes, Phaedo, Euclides, and others. But what interests most people about Socrates today comes from Plato’s philosophical portraits. Even these Platonic portraits vary somewhat across his many dialogues, but all are similar in one way or another to what we see in the Meno. Generally, Plato’s Socrates focuses his inquiries on moral subjects, and he will discuss them with anyone who is interested. He claims not to know the answers to his questions, and he interrogates others who do claim to know those answers. He seeks definitions of virtues like courage, moderation, justice, and piety, and often he suggests that each virtue, or virtue as a whole, is really some kind of knowledge.

As Plato depicts Socrates, it was not easy to understand his position in either the politics or the controversial new teachings of the time. Many of his contemporaries, like Meno and Anytus in this dialogue, probably could not distinguish his kinds of questions from other “arts of words” practiced by other intellectuals or “sophists.” But Plato often has Socrates criticizing sophists for claiming to teach more than they knew, and he emphasizes that, by contrast, Socrates never claimed to be a teacher, never accepted fees for his conversations, never sought wealth or political power, and always pursued subjects related to seeking the real nature of virtue.

To make matters more confusing, a few of the Thirty Tyrants or their extremist supporters, like Critias and Charmides, had earlier been associates of Socrates. But again, Socrates’ position in the conflict is not obvious. While he criticized democracy generally for putting power in the hands of an unwise and fickle majority, he never advocated rule by the wealthy either, and certainly not any of the Thirty’s cruel deeds. Plato emphasizes that Socrates respected common citizens more than the famous and powerful (Apology 21b-22e), and that he disobeyed direct orders from the Thirty, at risk to his own life (32cd). Socrates generally advocates humility and justice above all (for example, Apology 20cff, 29dff, Crito 49aff), and he specifically refutes and chastises Charmides and Critias in Plato’s Charmides.

ii. Meno

Meno is apparently visiting the newly restored Athenian government to request aid for his family, one of the ruling aristocracies in Thessaly, in northern Greece, that was currently facing new power struggles there. Meno’s family had previously been such help to Athens against Sparta that his grandfather (also named Meno) was granted Athenian citizenship. We do not know what resulted from Meno’s mission to Athens, but we do know that he soon left Greece to serve as a commander of mercenary troops for Cyrus of Persia—in what turned out to be Cyrus’ attempt to overthrow his brother, King Artaxerxes II.

Meno was young for such a position, about twenty years old, but he was a favorite of the powerful Aristippus, a fellow aristocrat who had borrowed thousands of troops from Cyrus for those power struggles in Thessaly, and was now returning many of them. The contemporary historian Xenophon (who also wrote Socratic dialogues) survived Cyrus’ failed campaign, and he wrote an account whose description of Meno resonates with Plato’s portrait here: ambitious yet lazy for the hard work of doing things properly, and motivated by desire for wealth and power while easily forgetting friendship and justice. But Xenophon paints Meno as a thoroughly selfish and unscrupulous schemer, while Plato sketches him as a potentially dangerous, overly confident young man who has begun to tread the path of arrogance. His natural talents and his privileged but unphilosophical education are not guided by wisdom or even patience, and he prefers “good things” like money over genuine understanding and moral virtue. In this dialogue, Plato imagines Meno encountering Socrates shortly before that disastrous Persian adventure, when he has not yet proved himself to be the “scoundrel” and “tyrant” that Socrates suspects and Xenophon later confirms. According to Xenophon, when Cyrus was killed and his other commanders were quickly beheaded by the King’s men, Meno was separated and tortured at length before being killed, because of his special treachery (see Xenophon’s Anabasis II, 6).

iii. Anytus

Anytus is a prominent Athenian politician and Meno’s host in Athens. He too was wealthy, not in Meno’s old aristocratic way, but as heir to the successful tannery of a self-made businessman. Anytus is passionately opposed to those sophists who thrived in Athens’ democracy and claimed to teach virtue along with so many other things. He prefers the more traditional assumption that good gentlemen learn goodness not from professional teachers but by association with the previous generation of good gentlemen. (That was a traditional aristocratic notion, but it has a democratic shape at Meno 92e, Apology 24d ff., and Protagoras 325c ff.) Although Plato was not a fan of most sophists either, he portrays Anytus’ attitude as clearly prejudicial. And though Socrates is no professional teacher, Anytus considers him just as bad, or worse. Anytus is one of three men who will bring Socrates to trial in 399 B.C.E.

Anytus had himself been prosecuted in 409 B.C.E., for failure as a general in the war against Sparta, and allegedly he escaped punishment by bribing the jury. Later, he supported the moderate faction among the Thirty Tyrants, and was banished by the extremists. Then he was a general for the democratic forces in the fight to overthrow the Thirty in 403 B.C.E., and he quickly became a leading politician in the restored democracy. In the Meno, Socrates presses Anytus about why so many of Athens’ leading statesmen have failed to teach even their own sons to be good, and Anytus could probably see that these questions apply to himself. Xenophon’s Apology of Socrates, which is rather different from Plato’s, suggests that Anytus had a personal grudge against Socrates, since Socrates had criticized Anytus’ education of his own son, and predicted that he would turn out to be no good. But Anytus may well have sincerely believed that Socrates corrupted young men like Critias and Charmides by teaching them to question good traditions. At any rate, Socrates’ questions about education in the Meno upset Anytus enough to warn Socrates to desist, or risk getting hurt—thus foreshadowing Anytus’ role in Socrates’ trial. (Compare Meno 94e f. and 99e f. with Apology 23a-24a and 30cd.)

c. Summary of Arguments, in Three Main Stages

There are three main parts to this dialogue, which are three main stages in the argumentation that leads to the tentative conclusion about how virtue is acquired.

The dialogue opens with Meno’s challenge to Socrates about how “virtue” (aretê) is achieved. Is it something that is taught, or acquired through training, or possessed by nature? Socrates quickly turns the discussion into an investigation of something more basic, namely, what such virtue is. Since Socrates denies knowing the nature of virtue, while Meno confidently claims to know all about it, Socrates gets Meno to try defining it. Most of this first third of the dialogue is then an extended series of arguments against Meno’s three attempts to define virtue. We see the famous “Socratic Method,” in which Socrates refutes someone’s claim to knowledge by revealing that one of their claims is contradicted by others that they also believe to be true. For example, Meno’s initial claim that there are irreducibly different virtues for different kinds of people (71e) is incompatible with his implicit belief (elicited by Socrates) that virtues cannot be different insofar as they are virtues. And Meno’s definition of virtue as the ability to rule over others (73d) is incompatible with his agreements that a successful definition of virtue must apply to all cases of virtue (so including those of children and slaves) and only to cases of virtue (so excluding cases of unjust rule). In each case, since Meno accepts these claims that contradict his proposed definitions, he is shown not to know what he thought he knew about virtue. As Socrates three times exposes the inadequacies of Meno’s attempted definitions, giving examples and guidelines for further practice, Meno’s enthusiasm gives way to reluctance and frustration. Eventually, Meno blames Socrates for his trouble, and insults Socrates by comparing him with the ugly, numbing stingray. Then he makes a momentous objection to conducting such an inquiry at all.

The second stage of the dialogue begins with that momentous, twofold objection: if someone does not already know what virtue is, how could he even look for it, and how could he even recognize it if he were to happen upon it? Socrates replies by reformulating that objection as a paradoxical dilemma, then arguing that the dilemma is based on a false dichotomy. The dilemma is that we cannot learn either what we know or what we do not know, because there is no need to learn what we already know, and we cannot recognize what we do not yet know. Socrates tries to expose the false dichotomy by identifying states of cognition between complete knowledge and pure ignorance. First, he introduces a notion that the human soul has learned in previous lives, and suggests that learning is therefore possible by remembering what has been known but forgotten. (Forgotten-but-capable-of-being-remembered is a state of cognition between complete knowledge and pure ignorance.) Then he tries to illustrate this “theory of recollection” with the example of a geometry lesson, in which Socrates refutes a slave’s incorrect answers much as he had refuted Meno, and then leads him to recognize that the correct answer is implied by his own prior true beliefs. (Implicit true belief is another state of cognition between complete knowledge and pure ignorance.) After the geometry lesson, Socrates briefly reinterprets the alleged “recollection” in a way that can be taken as the discovery of some kind of innate knowledge, or innate ideas or beliefs. Meno finds Socrates’ explanation somehow compelling, but puzzling. Socrates says he will not vouch for the details, but recommends it as encouraging us to work hard at learning what we do not now know. He asks Meno to join him again in a search for the definition of virtue.

But in the third stage of the dialogue, Meno nonetheless resists, and asks Socrates instead to answer his initial question: is virtue something that is taught, or is it acquired in some other way? Socrates criticizes Meno for still wanting to know how virtue is acquired without first understanding what it is. But he agrees, reluctantly, to examine whether virtue is something that is taught by way of “hypotheses” about what sorts of things are taught, and about what sorts of things are good. Here Socrates leads Meno to two opposed conclusions. First, he argues, on the hypothesis that virtue is necessarily good, that it must be some kind of knowledge, and therefore must be something that is taught. But then he argues, from the fact that no one does seem to teach virtue, that virtue is not after all something that is taught, and therefore must not be knowledge. This is where Anytus arrives and enters the discussion: he too objects to the sophists who claim to teach virtue for pay, and asserts that any good gentleman can teach young men to be good in the normal course of life. But then Anytus cannot explain Socrates’ long list of counterexamples: famous Athenians who were widely considered virtuous, but who did not teach their virtue even to their own sons. When Anytus withdraws from the conversation in anger, Socrates reminds Meno that sometimes people’s actions are guided not by knowledge but by mere true belief, which has not been “tied down by working out the reason.” He provisionally concludes that when people act virtuously, it is not by knowledge but by true belief, which they receive not by teaching but by some kind of divine gift. But then Socrates warns again that they will not really learn how virtue is acquired until they first figure out what virtue itself is.

2. Major Themes of the Dialogue

a. Virtue and Knowledge

In this whole inconclusive conversation, the most important Socratic proposal is that “virtue” (aretê in Greek) must be some kind of knowledge. But a crucial fact about the dialogue is that this central subject matter, while obviously very important, remains elusive from beginning to end. When Meno asks how aretê is acquired, Socrates denies knowing what aretê really is. Meno thinks he knows what aretê is, but he is soon surprised to find that he cannot define it. As they work at the definition, alleged examples of aretê range from political power to good taste and from justice to getting lots of money. At first, Meno wants to deny that all aretai share some common nature, but he quickly becomes ambivalent about that. Eventually, Socrates seems to persuade him that the essence of aretê must be some kind of knowledge, but then this provisional conclusion gives way under the observation that what they are looking for is apparently never actually taught. In closing, Socrates reminds Meno that their confusion about whether aretê is taught is a result of their confusion about the nature of aretê itself.

So what sort of thing is this aretê that they are trying to understand? Much of ancient Greek literature shows that aretê was a central ideal and basic motivator throughout the culture. The stylized heroes of Homer’s legendary Trojan war and the real soldiers of their own contemporary campaigns, the athletes at the Olympic games and the orators in political debates—all of these, whether they fought for survival or retribution or the common good, were also seeking honor from their peers for aretê. Both the importance and the vagueness of the term are expressed in Socrates’ question to Anytus:

Meno has been telling me for some time, Anytus, that he desires the kind of wisdom and aretê by which people manage their households and cities well, and take care of their parents, and know how to receive and send off fellow-citizens and foreign guests as a good man should. To whom should we send him for this aretê? (91a)

The standard English translations of aretê are “excellence” and “virtue.” “Excellence” reminds us that the ancient concept applies to all of the above and even to some admirable qualities in nonhuman things, like the speed of a good horse, the sharpness of a good knife, and the fertility of good farmland. But “virtue” too is sometimes still used that way, when we speak of the virtues of the plan or the brand that we prefer. And “excellence” is rather weak and abstract for the focus of these Socratic dialogues, which is something people spent a lot of time thinking and worrying about. Intellectuals debated how it is acquired; politicians knew they had to speak persuasively about it; and Socrates himself considered it the most important thing in life. In our dialogue, Meno keeps thinking of aretê in terms of ruling others and acquiring honor or wealth, while Socrates keeps reminding him that aretê must also include things like justice and moderation (73a, d, 78d), industriousness (81d, 86b), and self-control: “rule yourself,” he says, “so that you may be free” (86d). In this connection, it is often said that Greek ethical thinking evolved from a focus on competitive virtues like courage and strength to a greater appreciation of cooperative virtues like justice and fairness. But this could be at most a shift of emphasis, since even Homer’s epics of war and adventure celebrate pity and humility, justice and self-control. So it may help to think of our dialogue as asking how we can acquire “virtue” in the very general sense of human goodness or human greatness. Like Meno, most of us think we already know what “being a good person” or “being a great person” is like, but we would be stumped if we had to define it. The whole range of examples used in this dialogue would be relevant. And Socrates’ basic suggestion, that “being good and great” requires some important kind of knowledge, would seem both attractive and puzzling.

A further reason for the inconclusiveness of the Meno is the inherent difficulty of providing the kind of definition that Socrates seeks. He was notorious for always seeking and always failing to identify the essences of things like justice, piety, courage, and moderation. A successful definition in Socrates’ sense does not just state how a given word is used, or identify examples, or stipulate a special meaning for a given context. A Socratic definition is supposed to reveal the essence of a unitary concept or a type of real thing. Such a definition would specify not just any qualities that are common to that kind of thing, but the qualities that make them be the kind of thing they are. Other characters in Plato’s dialogues usually have difficulty understanding what Socrates is asking for; in fact, the historical Socrates may have been the first person to be rigorous about such definitions. The task is more difficult than it first seems, even for things like shape and color (see 75b-76e); it is even harder to accomplish for something like virtue. The first third of our dialogue takes the time to show that Meno’s list of examples will not do, because it does not reveal what is common to them all and makes them be virtue while other things are not (72a ff.); and that this kind of explanation must apply to all relevant cases (73d) and only to relevant cases (78d-e); and that something cannot be so explained in terms of itself or related terms that are still matters of dispute (79a-e). At the beginning of the dialogue, Meno did not know even how to begin looking for the one essence of all virtue that would enable us to understand things like how it is achieved. Socrates shows him these guidelines, and tries to get him to practice. But while Socrates clearly knows more than Meno about how to investigate the essence of virtue, he has not been able to discover exactly what it is.

Socrates is drawn to the idea that the essence of all virtue is some kind of knowledge. In the last third of the dialogue, when Meno will not try again to define virtue, Socrates introduces and explores his own suspicion in terms of the following “hypothesis”: if virtue is taught then it is knowledge, and if it is knowledge then it is taught, but not otherwise. This line is pursued with the further “firm hypothesis” that virtue must always be a good thing. Socrates argues that only knowledge is necessarily good, and the goodness or badness of everything else depends on whether it is directed by knowledge. The conclusion of this hypothetical investigation would be that virtue is taught because it is some kind of knowledge—and the argument to that effect requires the rejection of Meno’s constant preference for “good things” like wealth and power (78c-d, 87e-89a). But what kind of knowledge? Or what kind of wisdom? In this discussion, Socrates uses a variety of Greek knowledge-terms, combining epistêmê, phronêsis, and nous as if they were interchangeable. The cumulative meaning ranges from knowledge and intelligence to understanding and wisdom. Clearly, what Socrates is looking for would be not just theoretical knowledge but some kind of practical wisdom, a knowledge that can properly direct our behavior and our use of material things. But this dialogue gets no further than arguing that virtue is some sort of wisdom, “in whole or in part” (89a). And then Socrates introduces a reason for reconsidering even that: it seems that such wisdom is never taught.

b. Recollection and Innate Ideas

A surprising interpretation of knowledge occurs in the middle third of the Meno, when Socrates suggests that real learning is a special kind of remembering. Meno’s frustration in trying to define virtue had led him to object:

But in what way will you look for it, Socrates, this thing that you don’t know at all what it is? What sort of thing, among the things you don’t know, will you propose to look for? Or even if you should meet right up against it, how will you know that this is the thing you didn’t know? (80d)

Is Meno here honestly identifying a practical difficulty with this particular kind of inquiry, where the participants now seem not to know even what they are looking for? Or is he just throwing up an abstract, defensive obstacle, so that he does not have to keep trying? Socrates interprets Meno’s objection in the obstructionist way, and reformulates it as a paradoxical theoretical dilemma:

Do you see what a contentious debater’s argument you’re bringing up—that it seems impossible for a person to seek either what he knows or what he doesn’t know? He couldn’t seek what he knows, because he knows it, and there’s no need for him to seek it. Nor could he seek what he doesn’t know, because he doesn’t know what to look for. (80e)

This reformulation of Meno’s objection has come to be known as “Meno’s Paradox.” It is Plato’s first occasion for introducing his notorious “theory of recollection,” which is an early example of what would later be called a theory of innate ideas.

The notion that learning is recollection is supposed to show that learning is possible in spite of Meno’s objection: we can learn by inquiry, because we can begin in a state of neither complete knowledge nor pure ignorance. To understand what Plato intends with his sketchy theory, we should compare the initial statement of the idea (81a-e), the alleged illustration of it (82a-85b), and the restatement of it after the illustration (85b-86b). According to the initial statement, all souls have already learned everything in many former lives, and learning in this life is therefore a matter of remembering what was once known but is now forgotten. But this is apparently an attention-grabber, dubiously citing unnamed priests and poets, who are just the kind of people Socrates later criticizes for having intermittent true beliefs rather than stable knowledge about their subjects (99c-d). Meno is in fact intrigued, and when he asks for a demonstration, Socrates illustrates by cleverly leading an uneducated slave to the correct answer to a geometrical problem—and doing so by “only asking questions” and eliciting the correct answer from the slave himself. Here, Socrates clearly asks “leading questions,” and eventually even shows the slave the answer in the form of a question (84e). But more important is the fact that he legitimately helps the slave to work out the reasoning, and thereby see the way in which the unexpected answer was implied by other true beliefs that he already had. So the geometry lesson successfully demonstrates some of the beauty of Socratic education, and the power of deductive reasoning in learning. That is enough to refute Meno’s Paradox, which inferred the impossibility of learning from a false dichotomy between complete knowledge and pure ignorance.
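
As a rough illustration, and not a quotation from the dialogue, the arithmetic of the geometry lesson (82b-85b) can be sketched as follows. The symbols s and d, for the side and the diagonal of the original square, are notation introduced here for convenience; the dialogue itself works from a drawn figure with a side of two feet, and it establishes the last step by counting halves of squares in the diagram rather than by a formula.

```latex
\begin{align*}
s &= 2, \quad s^2 = 4 && \text{(the original square; the target area is } 2s^2 = 8\text{)} \\
(2s)^2 &= 16 \neq 8 && \text{(first guess: double the side)} \\
3^2 &= 9 \neq 8 && \text{(second guess: a side of three feet)} \\
d^2 &= s^2 + s^2 = 8 && \text{(the square built on the diagonal } d\text{)}
\end{align*}
```

On this reading, each wrong guess is ruled out by something the slave already accepts about areas, and the correct answer follows from what he already accepts about the figure.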

But the geometry lesson with the slave clearly does not demonstrate the reminding of something that was learned in a previous life. So it is important to notice that Socrates partly restates the “theory of recollection” after the geometry lesson. This time he concludes not that the slave has remembered some geometrical knowledge from what his mind had learned from experiences in previous lives, but instead that the slave has discovered the relevant true beliefs in his mind, which is somehow “always in a state of having learned” (86a). In the context, that “always” does seem to include many lifetimes, though it could in principle refer just to however long the mind has existed, perhaps since some point of development in the womb. In any case, the phrase “always in a state of having learned” is unusual and striking. If a mind could always be in a state of having learned something, then there would be no point at which it learned that thing. This paradoxical phrasing turns the initial statement of the theory of recollection, which stretched a common-sense notion of learning from experience over a number of successive lifetimes, into the beginnings of a theory of innate ideas, because the geometrical beliefs or concepts somehow belong to the mind at all times. Near this point in the dialogue, Socrates also states that after employing such ideas to elicit the relevant true beliefs, more work is still required for converting them to knowledge (85c-d). Later in the conversation, Socrates even seems to identify “recollection” with this latter part of the process (98a).

Some philosophers and experimental psychologists today agree that basic mathematical concepts, and the beliefs implicit in them (along with many others), are innate—not as an eternal possession of an immortal soul, but as a universal and specialized human capacity determined in part by biological evolution. So in a sense, Socrates’ conclusion that something of “the truth about reality” is “always in our minds” (86b) is even roughly compatible with modern science. The Meno does not end up specifying just what kind of innate resources enable genuine learning about geometry or virtue: Socrates infers from the geometry lesson both that the slave had innate knowledge (85d), and that he had innate beliefs that can be converted to knowledge (85c, 86a), but the dialogue ends with an agreement that “men have neither of these by nature, neither knowledge nor true belief” (98c-d). In fact, while Plato seems quite serious about the idea that genuine learning requires discovering knowledge for ourselves on the basis of our innate resources, he has Socrates disclaim confidence about any details of the theory in this dialogue (86b-c).

c. Teaching and Learning

According to Socrates, the practical purpose of the theory of recollection is to make Meno eager to learn without a teacher (81e-82a, 86b-c). It seems that Meno is used to thinking of learning as just hearing and remembering what others say, and he objects to continuing the inquiry into the nature of virtue with Socrates precisely because neither of them already knows what it is (80d). The geometry lesson shows that we can learn things we do not yet know (at least what we do not yet consciously and explicitly know) if they are entailed by other things that we know or correctly believe. And Socrates emphatically alleges that when the slave becomes aware of his own ignorance, he properly desires to overcome it by learning; this too is supposed to be an object lesson for Meno (84a-d). But Meno does not learn this lesson. Instead of desiring to inquire into the real nature of virtue, he asks instead to hear Socrates’ answer to his initial question about how virtue is acquired. He asks again whether virtue is something that is taught, and once again he wants to be taught about this just by being told (86c-d; compare 70a, 75b, 76a-b, 76d).

This time Socrates apparently relents, but he warns that the rest of their discussion will be compromised by a flawed approach. At least he gets Meno to follow him in a self-consciously “hypothetical” approach—a kind of method that he claims to borrow from mathematicians, who use it when they cannot prove more securely what they want to prove. He illustrates with a geometrical hypothesis that is notoriously obscure, but the corresponding hypothesis about virtue seems to be this: if virtue is something that is taught, then it is a kind of knowledge, and if it is a kind of knowledge, then it is something that is taught (87b-c). Next, Socrates offers an independent argument (based on a different hypothesis) that virtue must in fact be some kind of knowledge, because virtue is necessarily good and beneficial, and only knowledge could be necessarily good and beneficial. Together with the hypothesis that knowledge and only knowledge is taught, Socrates would have proved that virtue is something that is taught.

But there is something wrong with the hypothesis that all and only knowledge is taught. Surely much of what is taught is just opinion, and surely some knowledge is learned on one’s own, without a teacher. In fact, one main point of the theory of recollection and the geometry lesson was that real learning requires active inquiry and discovery from one’s own resources, which include some form of innate knowledge. Even if Socrates did “teach” the geometry lesson in a Socratic way, by leading the slave to the answer with the right questions, nonetheless he showed that while he could in some sense just show the slave the answer, he could not successfully give him knowledge or understanding. That requires working out the explanation for oneself (82d, 83d, 84b-c, 85c-d; compare 98a). This whole lesson was conducted in order to encourage Meno to try learning what virtue is, when he does not have a teacher to tell him what it is (81e-82a, 86c).

So why would Socrates use the faulty hypothesis that knowledge and only knowledge is taught, when it contradicts his notion of recollection and his model geometry lesson? Perhaps because, in effect, it is really Meno’s own hypothesis, as his opening questions and his behavior throughout the dialogue persistently imply. Meno’s opening set of questions substitutes “learned” for “taught” as if they were the same thing (Is virtue taught? Or is it trained? Or is it neither learned nor trained…). And then he just wants to hear Socrates’ answers, and keeps resisting the hard work of definition that Socrates keeps encouraging. When Meno resists yet again after the theory of recollection and the geometry lesson (86c), Socrates cleverly investigates this hypothesis, implicit in Meno’s behavior, to redirect Meno’s attention from his question about how virtue is acquired (Is it taught?) back to the unanswered question of what virtue is (Is it knowledge?). So Socrates could be quite serious in his lengthy argument that virtue must be some kind of knowledge (87c-89a), while reluctantly making use of the unsupported hypothesis that knowledge must be taught because, in effect, Meno insists upon it. Meno refuses to pursue knowledge of virtue the hard way, and he thinks that what he hears about virtue the easy way is knowledge.

After persuading Meno to take seriously his own favorite notion—that virtue is achieved through some kind of knowledge, rather than through wealth and political power—Socrates endeavors to convince Meno that learning just by hearing from others does not provide real knowledge or real virtue. Meno’s host Anytus now arrives at just the right moment, since Anytus is passionately opposed to the sophists who claim to teach wisdom and virtue with their traveling lectures and verbal displays. Anytus believes that virtue can be learned instead by spending time with any good gentleman of Athens, but Socrates shows that this view is superficial, too. He gathers well-known examples of allegedly virtuous men who did not teach their virtue even to their own children, which indicates that virtue is not something that is taught. Anytus departs in annoyance at Socrates’ seemingly dismissive treatment of Athens’ political heroes, so Socrates continues the issue with Meno. He reminds Meno that even professional teachers and good men themselves disagree about whether virtue can be taught. The closing pages argue that if their earlier hypothesis was true, and “people are taught nothing but knowledge,” then since virtue is not taught, virtue would not be knowledge. Socrates suggests that perhaps it could be correct belief instead. Correct belief can direct our behavior well, too, though not nearly as reliably as knowledge.

In this final portion of the dialogue, Socrates twice again asks Meno whether “if there are no teachers, there are no learners.” And Meno keeps affirming it, though no longer with full confidence: “I think … So it seems … if we have examined this correctly” (96c-d). Meno’s challenge to Socrates in the opening lines of the dialogue had used the terms “learned” and “taught” interchangeably. In the meantime, Socrates’ notion of learning as “recollection” indicates that knowledge requires much more than verbal instruction. As Socrates says to Anytus:

For some time we have been examining … whether virtue is something that’s taught. To that end we are asking whether good men past or present know how to bestow on another this virtue which makes them good, or whether it just isn’t something a man can give or receive from another. (93a-b)

Meno’s assumption that knowledge must be taught, and taught by mere verbal instruction, prevents a fuller investigation in this dialogue of Socrates’ hope that virtue is a kind of knowledge.

d. Theory and Practice

And what about Socrates: does he teach virtue in the Meno? He offers a theory that “there is no teaching but recollection” (82a). But what about his practice? Isn’t Socrates trying to teach Meno, by leading him to a correct definition of virtue, as he led Meno’s slave to the correct answer in the geometry lesson?

Rather, Socrates’ practice in the geometry lesson actually goes pretty well with his theory that there is no teaching, because his leading questions there require that the slave think through the deduction of the answer from what he already knew. And Socrates finishes by emphasizing that real knowledge of the answer requires working out the explanation for oneself. So even if a “teacher” can show the answer, he cannot give the understanding. The understanding requires active inquiry and discovery for oneself, based on innate mental resources and a genuine desire to learn. Whatever else might prove true or false about the notion that learning is a kind of recollection, these practical implications are what Socrates insists upon.

On behalf of the rest of the theory, I wouldn’t much insist. But we’ll be better men, braver and less lazy, if we believe that we must search for the things we don’t know, rather than if we believe that it’s not possible to find out what we don’t know, and that we must not search for it—this I would fight for very much, so long as I’m able, both in theory and in practice. (86b-c)

The practical side of learning as recollection applies no less in Socrates’ interactions with Meno. Socrates tries leading Meno to desire real knowledge of what virtue is rather than just collecting others’ opinions about how it is acquired, and tries to get him to practice active inquiry and discovery of the truth for himself, starting from his own basic and sincere beliefs about virtue. Meno’s moral education would call for all of that even if Socrates could tell him what the essence of virtue is, which he claims he cannot do.

Active Socratic inquiry requires humble hard work on the part of all learners: practice in the sense of the personal effort and training that properly develops natural ability. Socrates’ efforts to guide Meno throughout the dialogue indicate that achieving the wisdom that is virtue would require both the right kind of natural abilities and the right kind of training or practice—so that teaching can help if it is not mere verbal instruction but discussions that help a learner to discover the knowledge for himself. That could be the whole dialogue’s answer to Meno’s opening challenge, which specifies three options:

Tell me if you can, Socrates: Is virtue something that’s taught? Or is it not taught, but trained? Or is it neither trained nor learned, but people get it by nature, or in some other way? (70a)

Some have argued that Plato mentions training in the opening lines only because it was one of the traditional options debated in his day. It seems to be tacitly dropped from the rest of the dialogue, and when Meno later revisits his opening challenge, he omits the option about training (86c-d). But if Meno forgets or deliberately avoids it, Socrates does not. When Meno starts to recognize his difficulties, Socrates encourages him to practice with definitions about shape (75a) and gives him a series of paradigms or examples to practice with (73e-77a); later, he criticizes Meno for refusing to do so (79a). At a number of points, Socrates draws attention to the kind of training and habits Meno has already received (70b, 76d, 82a). The geometry lesson, which is supposed to exhibit successful persistent inquiry in the face of previous failures, concludes with advice about the need to work through problems “many times in many ways” (85c) and with a repeated warning about intellectual laziness (86b). While the theory that learning is recollection suggests that an essential basis for wisdom and virtue is innate, Socrates also reminds Meno that any such basis in nature would still require development through experience (89b). When Anytus enters the discussion, his father is praised as a man who, unlike Anytus himself, did not receive his prosperity as a gift from his father, but earned it “by his own skill and hard work” (90a). And the combination of quotations from Theognis near the end of the dialogue suggests that virtue is learned not through verbal teaching alone, but through some kind of character-apprenticeship under the guidance of others who are already accomplished in virtue (95d ff.).

Socrates’ persistence in encouraging Meno to practice active inquiry points in the same direction as the sketchy theory of recollection: while the kind of wisdom that could be real virtue would require understanding the nature of virtue itself, it would not be achieved by being told the definition. And it would not be a theoretical understanding divorced from the practice of virtue. In fact, our dialogue as a whole shows that Meno will not acquire the wisdom that is virtue until after he already practices some measure of virtue: at least the kind of humility, courage, and industriousness that are necessary for genuine learning.

3. Relations of the Meno to Other Platonic Dialogues

We cannot be precise or certain about much in Plato’s writing career. The Meno seems to be philosophically transitional between rough groupings of dialogues that are often associated in allegedly chronological terms, though these groupings have been qualified and questioned in various ways. It is commonly thought that in the Meno we see Plato transitioning from (a) a presumably earlier group of especially “Socratic” dialogues, which defend Socrates’ ways of refuting unwarranted claims to knowledge and promoting intellectual humility, and so are largely inconclusive concerning virtue and knowledge, to (b) a presumably “middle” group of more constructively theoretical dialogues, which involve Plato’s famous metaphysics and epistemology of transcendent “Forms,” such as Justice itself, Equality itself, and Beauty or Goodness itself. (However, that second group of dialogues remains rather tentative and exploratory in its theories, and there is also (c) a presumably “late” group of dialogues that seems critical of the middle-period metaphysics, adopting somewhat different logical and linguistic methods in treating similar philosophical issues.) So the Meno begins with a typically unsuccessful Socratic search for a definition, providing some lessons about good definitions and exposing someone’s arrogance in thinking that he knows much more than he really knows. All of that resembles what we see in early dialogues like the Euthyphro, Laches, Charmides, and Lysis. But the style and substance of the Meno change somewhat with the formulation of Meno’s Paradox about the possibility of learning anything with such inquiries, which prompts Socrates to introduce the notions that the human soul is immortal, that genuine learning requires some form of innate knowledge, and that progress can be made with a kind of hypothetical method that is related to mathematical sciences. This cluster of Platonic concerns is variously developed in the Phaedo, Symposium, Republic, and Phaedrus, but in those dialogues, these concerns are combined with arguments concerning imperceptible, immaterial Forms, which are never mentioned in the Meno. Accordingly, many scholars believe that the Meno was written between those groups of dialogues, and probably about 385 B.C.E. That would be about seventeen years after the dramatic date of the dialogue, about fourteen years after the trial and execution of Socrates, and about the time that Plato founded his own school at the gymnasium called the Academy.

More specifically, significant relations of the Meno to other Platonic dialogues include the following.

The Meno is related by its dramatic setting to the famous series of dialogues that center on the historical indictment, trial, imprisonment, and death of Socrates (Euthyphro, Apology, Crito, and Phaedo). Anytus in the Meno will be one of the three men who prosecute Socrates, which is specifically foreshadowed in the Meno at 94e.

The failed attempt to define virtue as a whole in the Meno is much like the failed attempts in other dialogues to define particular virtues: piety in the Euthyphro, courage in the Laches, moderation in the Charmides, and justice in the first book of the Republic. (And two other dialogues attempt and fail to define terms that are related to virtue: friendship in the Lysis and beautiful/good/fine (to kalon) in the Hippias Major.) Those dialogues emphasize some of the same criteria for successful definitions as the Meno, including that it must apply to all and only relevant cases, and that it must identify the nature or essence of what is being defined. The Meno adds another criterion: that something may not be defined in terms of itself, or in related terms that are still subject to dispute.

One of Socrates’ arguments late in the Meno, that virtue probably cannot be taught because men who are widely considered virtuous have not taught it even to their own sons, is also used near the beginning of Plato’s Protagoras. But there it is countered by a long explanation from the sophist Protagoras of how virtue is in fact taught to everyone by everyone, not with definitions or by mere verbal instruction, but in a life-long training of human nature through imitation, storytelling, and rewards and punishments of many kinds. Socrates does not object to this theory of moral education (instead he objects to other parts of Protagoras’ account), and elements of it are included in the system of education outlined by Socrates in Plato’s Republic. But while Plato’s treatment of Protagoras’ theory of education in the Protagoras is fairly sympathetic, the Meno’s general disparagement of sophistic teaching is explored at length in Socrates’ debates with individual sophists in Plato’s Euthydemus, Gorgias, Hippias Minor, and Hippias Major.

The Meno’s geometry lesson with the slave, where success in learning some geometry is supposed to encourage serious inquiry about virtue, is one indication of Plato’s interest in relations between mathematical and moral education. In the Gorgias (named after a sophist or orator who is mentioned early in the Meno as one of Meno’s teachers), Socrates debates an ambitious young orator-politician who is drawn to a crass hedonism, and claims that his soul lacks good order because he neglects geometry, and so does not appreciate the ratios or proportions exhibited in the good order of nature. Book VII of the Republic describes a system of higher education designed for ideal rulers, which uses a graduated series of mathematical studies to prepare such rulers for philosophical dialectic and for eventually understanding the Form of Goodness itself. In this connection, Socrates’ introduction of a “hypothetical” method of inquiry, adopted from mathematics, is developed somewhat in the Phaedo and in Republic Book VI.

The notion of learning as recollection is revisited most conspicuously in Plato’s Phaedo (72e-76e) and Phaedrus (246a ff.), both of which associate it closely with theories of human immortality and eternal, transcendent Forms. The passage about recollection in the Phaedo even begins by alluding to the one in the Meno, but then it discusses recollection not of specific beliefs or propositions (like the theorem about doubling the square in the Meno), but of basic general concepts like Equality and Beauty, which Socrates argues cannot be learned from our experiences in this life. In the Phaedrus, recollection of such Forms is not argued for but asserted, in a rather suggestive and playful manner, as part of a myth-based story about the human soul’s journeys with gods, which is meant to convey the power of love in philosophical learning. Plato also explores other models of innate knowledge elsewhere, such as an innate mental pregnancy in the Symposium (206c-212b; compare Phaedrus 251a ff.) and an innate intellectual vision in the Republic (507a-509c, 518b ff.).

4. References and Further Reading

a. The Standard Greek Text

  • Burnet, John. Platonis Opera, vol. III. Oxford: Clarendon Press, 1903.

b. Some English Translations

  • Plato: Meno. Translated by G. M. A. Grube. Second Edition. Hackett Publishing, 1980.
  • Plato: Meno and Phaedo. Translated by Alex Long and David Sedley. Cambridge Texts in the History of Philosophy. Cambridge University Press, 2011.
  • Plato: Protagoras and Meno. Translated by Adam Beresford and introduced by Lesley Brown. Penguin Classics, 2006.

c. Some Book-Length Studies

  • Bluck, R. S. Plato’s Meno, Edited with Introduction and Commentary. Cambridge University Press, 1961.
  • Klein, Jacob. A Commentary on Plato’s Meno. University of North Carolina Press, 1965.
  • Scott, Dominic. Plato’s Meno. Cambridge University Press, 2006.
  • Sharples, R. W. Plato’s Meno, Edited with Translation and Notes. Chicago: Bolchazy-Carducci, 1984.
  • Weiss, Roslyn. Virtue in the Cave: Moral Inquiry in Plato’s Meno. Oxford University Press, 2001.

d. Some Articles and Essays on the Major Themes

i. Virtue and Knowledge

  • Fine, Gail. “Inquiry in the Meno.” In The Cambridge Companion to Plato, edited by Richard Kraut, 200-226. Cambridge University Press, 1992.
  • Brickhouse, Thomas C., and Nicholas D. Smith. “Socrates and the Unity of the Virtues.” The Journal of Ethics 1 (1997): 311-324.
  • Santas, Gerasimos. “Socratic Definitions.” In Gerasimos Santas, Socrates: Philosophy in Plato’s Early Dialogues, 97-135. Routledge and Kegan Paul, 1979.
  • Vlastos, Gregory. “The Socratic Elenchus: Method Is All.” In Socratic Studies, edited by Gregory Vlastos, 1-37. Cambridge University Press, 1994.
  • Woodruff, Paul. “Plato’s Earlier Theory of Knowledge.” In Essays on the Philosophy of Socrates, edited by Hugh Benson, 86-106. Oxford University Press, 1992.

ii. Recollection and Innate Ideas

  • Moravcsik, Julius. “Learning as Recollection.” In Plato I: Metaphysics and Epistemology, edited by Gregory Vlastos, 53-69. Anchor Books, 1971.
  • Rawson, Glenn. “Platonic Recollection and Mental Pregnancy.” Journal of the History of Philosophy 44 (2006): 137-155.
  • Vlastos, Gregory. “Anamnesis in the Meno.” Dialogue IV (1965): 143-167.

iii. Teaching and Learning

  • Devereux, Daniel T. “Nature and Teaching in Plato’s Meno.” Phronesis 23 (1978): 118-126.
  • Scolnicov, Samuel. “Three Aspects of Plato’s Philosophy of Learning and Instruction.” Paideia Special Plato Issue (1976): 50-62.
  • Woodruff, Paul. “Socratic Education.” In Philosophers on Education, edited by Amelie Rorty, 13-29. Routledge, 1998.

iv. Theory and Practice

  • Nehamas, Alexander. “Meno’s Paradox and Socrates as a Teacher.” In Essays on the Philosophy of Socrates, edited by Hugh Benson. Oxford University Press, 1992.
  • Rawson, Glenn. “Speculative Theory, Practical Theory, and Practice in Plato’s Meno.” Southwest Philosophy Review 17 (January 2001): 103-112.

Author Information

Glenn Rawson
Email: grawson@ric.edu
Rhode Island College
U. S. A.

Liar Paradox

The Liar Paradox is an argument that arrives at a contradiction by reasoning about a Liar Sentence. The Classical Liar Sentence is the self-referential sentence:

This sentence is false.

It leads to the same difficulties as the sentence, I am lying. Experts in the field of philosophical logic have never agreed on the way out of the trouble despite 2,300 years of attention. Here is the trouble. It is a sketch of the Paradox, the argument that reveals the contradiction:

Let L be the Classical Liar Sentence. If L is true, then L is false. But the converse also can be established, as follows. Assume L is false. Because the Liar Sentence is just the sentence ‘L is false’, the Liar Sentence is therefore true, so L is true. What has now been shown is that L is true if, and only if, it is false. Since L must be one or the other, it is both.

That contradictory result apparently throws us into the lion’s den of semantic incoherence. The incoherence is due to the fact that, according to the rules of classical logic, anything follows from a contradiction, even 1 + 1 = 3. This article explores the details and implications of the principal ways out of the Paradox, that is, the ways of preserving or restoring semantic coherence.

Most people, when first encountering the Liar Paradox, react in one of two ways. One reaction is not to take the Paradox seriously and say they will not reason any more about it. The second and more popular reaction is to say the Liar Sentence must be meaningless. The first reaction, not taking it seriously, provides no useful diagnosis of the original problem of semantic incoherence. The second is not an adequate solution if it can answer the question, Why is the Classical Liar Sentence meaningless? only with the ad hoc remark “Otherwise we get a paradox.” An adequate solution should offer a more systematic treatment. For example, the sentence ‘This sentence is not in Italian’ is very similar to the Classical Liar Sentence. Is it meaningless, too? Apparently not. So, what feature of the Liar Sentence makes it meaningless while ‘This sentence is not in Italian’ is not meaningless?

Is the Liar Paradox importantly different if one considers it to be about statements or propositions rather than sentences? The classical view of propositions is that a proposition is what a person uses a sentence to say, and that a proposition has its truth value independently of the sentence used to express it. So, one issue is whether it is important to start the Liar Paradox argument with this liar sentence:

What this sentence says is false.

instead of this one:

This sentence is false.

The questions about the Liar Paradox continue, and an adequate solution should address the questions formally or at least systematically.

Table of Contents

  1. History of the Paradox
    1. Strengthened Liar Paradox
    2. Why the Paradox is a Serious Problem
    3. Tarski’s Undefinability Theorem
  2. Overview of Ways Out of the Paradox
    1. Five Ways Out
    2. Sentences, Statements, and Propositions
    3. An Ideal Solution to the Paradox
    4. Should Classical Logic be Revised?
  3. Assessing the Five Ways Out
    1. Russell’s Type Theory
    2. Tarski’s Hierarchy of Meta-Languages
    3. Kripke’s Hierarchy of Interpretations
    4. Barwise and Etchemendy
    5. Paraconsistency
  4. Conclusion
  5. References and Further Reading

1. History of the Paradox

Zeno’s Paradoxes were discovered in the 5th century B.C.E., and the Liar Paradox was discovered later in the middle of the 4th century B.C.E. Both were discovered in ancient Greece. The oldest attribution of the Liar Paradox is to Eubulides of Miletus, a contemporary of Aristotle, who included it among a list of seven puzzles. He said, “A man says that he is lying. Is what he says true or false?” Eubulides’ actual commentary on the Liar has not been found. An ancient gravestone on the Greek Island of Kos was reported by Athenaeus to contain this poem which might be about the difficulty of solving the Paradox:

O Stranger: Philetas of Kos am I,

‘Twas the Liar who made me die,

And the bad nights caused thereby.

Aristotle first clearly described the principle that no sentence can be contradictory; see his Metaphysics Book IV, Chapter 3, 1005b lines 6-34. Theophrastus, Aristotle’s successor, wrote three papyrus rolls about the Liar Paradox, and the Stoic philosopher Chrysippus wrote six, but their contents are lost in the sands of time. Despite various comments on how to solve the Paradox, no Greek suggested that the Greek language itself was inconsistent; it was the reasoning within Greek that was considered to be inconsistent.

In the eleventh century, St. Peter Damian of Italy asserted that even an omnipotent God could not make a contradiction be true.

In the Late Medieval period in Europe, the French philosopher Jean Buridan put the Liar Paradox to devious use with the following proof of the existence of God. It uses the pair of sentences:

God exists.

None of the sentences in this pair is true.

The only consistent way to assign truth values (being true or being false) requires making the sentence God exists be true. In this way, Buridan has apparently proved that God does exist.
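The claim that only one assignment is consistent can be checked by brute force. The following is a minimal sketch, not Buridan's own reasoning: it assumes classical two-valued truth and assumes that an assignment counts as consistent when the second sentence's value matches whether what it says holds. The function name is hypothetical.

```python
from itertools import product

def consistent_assignments():
    """Return every (god_exists, second) pair of truth values that is self-consistent."""
    results = []
    for god_exists, second in product([True, False], repeat=2):
        # What the second sentence says: neither sentence in the pair is true.
        what_second_says = (not god_exists) and (not second)
        # Assumption: an assignment is consistent when the second sentence's
        # value agrees with whether what it says actually holds.
        if second == what_second_says:
            results.append((god_exists, second))
    return results

print(consistent_assignments())  # [(True, False)]
```

Under these assumptions the only surviving assignment makes God exists true and the second sentence false, which is the result the argument trades on.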

There are many other versions of the Paradox. Some Liar Paradoxes begin with a chain of sentences, no one of which is self-referential, although the chain as a whole is self-referential or circular:

The following sentence is true.

The following sentence is true.

The following sentence is true.

The first sentence in this list is false.
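That this chain admits no consistent assignment of truth values can be illustrated with a short enumeration. This is only a sketch under the assumption of classical two-valued truth, reading each sentence as true exactly when what it says holds; the function name is hypothetical.

```python
from itertools import product

def chain_is_consistent(values):
    """values holds the truth values assigned to sentences 1 to 4 of the chain."""
    s1, s2, s3, s4 = values
    return (
        s1 == s2              # sentence 1 says sentence 2 is true
        and s2 == s3          # sentence 2 says sentence 3 is true
        and s3 == s4          # sentence 3 says sentence 4 is true
        and s4 == (not s1)    # sentence 4 says sentence 1 is false
    )

# Try all 16 assignments and keep the consistent ones.
consistent = [v for v in product([True, False], repeat=4) if chain_is_consistent(v)]
print(consistent)  # []
```

No sentence in the chain mentions itself, yet none of the sixteen assignments survives.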

There are also Contingent Liars which may or may not lead to a paradox depending on what happens in the world beyond the sentence. For example:

It is raining, and this sentence is false.

Paradoxicality here depends on the weather. If it is sunny, then the sentence is simply false, but if it is raining, then we have the beginning of a paradox.
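The dependence on the weather can be made concrete. The sketch below assumes classical two-valued truth and treats a value as stable when it matches what the sentence says; the function name and the raining parameter are illustrative.

```python
def consistent_values(raining):
    """Truth values that 'It is raining, and this sentence is false' can stably take."""
    stable = []
    for value in (True, False):
        # What the sentence says: it is raining AND the sentence itself is false.
        what_it_says = raining and (not value)
        if value == what_it_says:
            stable.append(value)
    return stable

print(consistent_values(raining=False))  # [False]
print(consistent_values(raining=True))   # []
```

When it is not raining the sentence is simply false; when it is raining no classical value is stable.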

Suppose we try to solve the paradox by saying the Classical Liar Sentence, namely L, is so odd that it is neither true nor false. This way out fails for the following reason. If L were to be neither true nor false, as this treatment is suggesting, then, by the meaning of neither…nor, L is not false. But that consequence implies that what L says of itself (namely, that it is false) is false. So, L is false. This result leaves us with a contradiction (that L is false and not false). Unless there is a mistake in this reasoning, taking the route of saying the Liar Sentence is neither true nor false is not a successful treatment.

a. Strengthened Liar Paradox

Suppose we were somehow to have found a promising way out of the Classical Liar Paradox. Ahead looms the Strengthened Liar Paradox. The Strengthened Liar Paradox is called Strengthened because some promising solutions to the Classical Liar Paradox fail when faced with the Strengthened Liar Paradox.

The Strengthened Liar Paradox (also called the Strong Liar Paradox) can begin with a Strengthened Liar Sentence such as:

This sentence is not true,

to produce a contradiction. For example, let us stipulate that L’ is a name of the Strengthened Liar Sentence, and let us stipulate that the phrase This sentence within L’ refers to the full sentence L’. Surely L’ is either true or not true. Let’s examine both cases, or both disjuncts, starting with the second disjunct. Suppose L’ is not true. If L’ is not true, then that apparently implies it is true since any speaker who expresses the sentence is saying it is not true. Having established this result, now let’s make a different supposition starting with the first disjunct. Suppose L’ is true. If L’ were true, then that implies, just from the meaning of the sentence, that it is not true. That is our second result. Now, let’s combine the two results and we have established that L’ is true if and only if it is not true. Now we have a paradox because L’ is true or it is not.

Here is another version of the Strengthened Liar Paradox. Suppose you believe a promising way to solve the Classical Liar Paradox is to call the Classical Liar Sentence meaningless, with the assumption that any declarative sentence is true, false or meaningless. Before you can be content with that treatment, you must consider that it is not meaningless to call a sentence meaningless. If the Classical Liar Paradox is apparently solved formally by having an object language that allows a truth predicate and a falsehood predicate and a predicate that applies to meaningless phrases, then one could form in the object language a different Strengthened Liar Sentence, call it L”, that informally says:

This sentence is either false or meaningless.

Now we are on the road to paradox again. Surely L” is either true or it is not. Let us examine both disjuncts. (1) Suppose L” were true. If L” is true, then it is false or meaningless. If so, then it is not true. (2) Now for the second disjunct. Suppose L” were not true. Why would a declarative sentence not be true? Because it is false or meaningless. But the sentence’s being false or meaningless is precisely the claim being made by speakers of L”, so it follows that L” is true. So, by combining the results from both (1) and (2), one may conclude that L” is true if and only if it is not. We have a contradiction. It is understandable why this reasoning is often called the revenge of the liar.
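The revenge effect can be illustrated with a toy model. It assumes, as the reasoning above does, that every declarative sentence is true, false or meaningless, and that a sentence is true exactly when what it says is so. The names STATUSES and stable_statuses are hypothetical.

```python
STATUSES = ("true", "false", "meaningless")

def stable_statuses(claim):
    """Statuses s for which 'the sentence is true' agrees with whether its claim about s holds."""
    return [s for s in STATUSES if (s == "true") == claim(s)]

# Classical Liar: "This sentence is false."
classical = stable_statuses(lambda s: s == "false")
# Strengthened Liar L'': "This sentence is either false or meaningless."
strengthened = stable_statuses(lambda s: s in ("false", "meaningless"))

print(classical)     # ['meaningless']
print(strengthened)  # []
```

In this model, calling the Classical Liar Sentence meaningless is a stable choice, but no status at all is stable for L”.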

We do not want to solve the Classical Liar Paradox only to be ensnared by the Strengthened Liar Paradox. Therefore, finding one’s way out of the Strengthened Liar Paradox is the acid test of a successful solution.

In discussions below, where context does not disambiguate between the Classical Liar Paradox and the Strengthened Liar Paradox and where it is not important to distinguish them, the simple phrase the Liar Paradox is used.

b. Why the Paradox is a Serious Problem

To put the Liar Paradox in perspective, it is essential to appreciate why such an apparently trivial problem is a deep problem. Solving the Liar Paradox is part of the larger project of understanding truth. Understanding truth is a difficult project that involves finding a theory of truth, or a definition of truth, and a proper analysis of the concept of truth. These are distinct projects, but the current article does not carefully distinguish them from each other.

Some researchers believe the Liar Paradox is one of several unresolvable knots in our language that “do exist and are not merely the product of careless and confused reasoning” (Mates 1981, 3). One of the aims of this article is to assess this claim.

Before saying more about the paradox and about a theory of truth, let us be clear about what a contradiction is. When this article speaks of a contradiction in a sentence that is being or can be asserted, it means a sentence that is equivalent to a compound sentence that has the logical form of an assertion and its denial. Slightly more formally, the logical form of a contradiction is P and Not P, where P is some declarative sentence or independent clause, and Not P is its negation. When a Marxist speaks of the contradiction in capitalism, the Marxist is not referring to a contradiction in the sense of that term that is of interest to this article, but rather to the fact that opposing social forces will clash and produce a restructuring of the society’s economic system.

Languages are expected to contain contradictions but not paradoxes. A contradictory sentence such as Snow is white, and snow is not white, is just one of the many false sentences in the English language. But languages are not expected to contain or permit paradoxes, that is, apparently good inferences in support of a contradiction, at least not in the philosopher’s sense of that word. Informally, many speakers will sometimes say of any very surprising or puzzling chain of reasoning that it is a paradox, for example the Twin Paradox of Einstein’s Theory of Relativity, but this is not the sense of the word paradox used in this article. A paradox in our sense is an apparently convincing argument leading from apparently true premises to a contradictory conclusion of the logical form P and Not P.

Why is that conclusion a problem? Well, let L be the Liar sentence, and let our contradictory conclusion be that L is both true and false. Calling a sentence false is apparently equivalent to calling its negation true. So, if ~L is the formal representation of the negation of L, and if we accept the conclusion of the Liar Paradox, then the compound sentence L and ~L is true. Now the trouble begins. Let Q be some sentence we already know not to be true, say 1 + 1 = 3. Then we can reason this way:

1. L and ~L from the Liar Paradox
2. L from 1
3. L or Q from 2 using the Law of Addition
4. ~L from 1
5. Q from 3 and 4

This apparently legitimate proof that 1 + 1 = 3 is outrageous. That is why the paradox is a serious problem. An appropriate reaction to any paradox is to look for some unacceptable assumption made in the apparently convincing argument or else to look for a faulty step in the reasoning. Only very reluctantly would one want to learn to live with the contradiction being true, or ignore the contradiction altogether. The very existence of the Liar Paradox and other semantic paradoxes is evidence that there are principles we use which we have been taking to be obviously valid or obviously correct but which are not.
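Each step of the five-line proof is classically valid, which a truth-table check can confirm. The sketch below assumes classical two-valued logic and reads “from” as entailment: no valuation makes every premise true and the conclusion false. The helper entails is hypothetical.

```python
from itertools import product

def entails(premises, conclusion):
    """Classical entailment over all valuations of L and Q."""
    for L, Q in product([True, False], repeat=2):
        if all(p(L, Q) for p in premises) and not conclusion(L, Q):
            return False  # found a counterexample valuation
    return True

contradiction = lambda L, Q: L and not L  # line 1 of the proof: L and ~L

step2 = entails([contradiction], lambda L, Q: L)          # L from 1
step3 = entails([lambda L, Q: L], lambda L, Q: L or Q)    # L or Q from 2 (Law of Addition)
step4 = entails([contradiction], lambda L, Q: not L)      # ~L from 1
step5 = entails([lambda L, Q: L or Q, lambda L, Q: not L],
                lambda L, Q: Q)                           # Q from 3 and 4
overall = entails([contradiction], lambda L, Q: Q)        # Q from L and ~L directly

print(step2, step3, step4, step5, overall)  # True True True True True
```

Steps 2 and 4 hold vacuously, because no valuation makes L and ~L true. A way out that blocks the derivation must therefore reject one of these classically valid steps or reject line 1.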

By the way, what this article calls paradoxes are called antinomies by Quine, Tarski, and some other authors.

Let us return to the issue of understanding truth by finding a theory of truth. We naturally want our theory of truth not to allow paradoxes. Aristotle offered what most philosophers consider to be a correct, necessary condition for any adequate theory of truth. Stripped of his overtones suggesting a correspondence theory of truth, Aristotle proposed (in Metaphysics 1011 b26) what is now called a precursor to Alfred Tarski’s Convention T (or his T-scheme):

A sentence is true if, and only if, what it says is so.

In his 1933 article, “The Concept of Truth in Formalized Languages,” Tarski rephrased the idea this way:

A true sentence is one which says that the state of affairs is so and so, and the state of affairs indeed is so and so.

Before we say more about the trouble with our theories of truth and reference, it will be helpful to describe the use-mention distinction. This is the distinction between using a term and mentioning it. Let us not confuse a dog with its name. Lassie is a helpful dog, but the word Lassie is not a dog at all; it is a six-letter word. Placing pairs of quotation marks around a term, or italicizing it, serves to name it or mention it. The use-mention distinction applies to sentences as well as terms.

Tarski’s Convention T says a formally correct truth-definition should logically imply all sentences that say, for example: the sentence Snow is white is true just in case snow is white. Here is a second example of the form of the sentences Tarski is aiming at:

The sentence Aristotle was a student of Plato is true just in case Aristotle was a student of Plato.

If the same sentence about snow were named or mentioned not with italics or quotation marks but with the numeral 88 inside a pair of parentheses, then (88) would be true just in case snow is white. There is still another way to refer to sentences, namely via self-reference. If I say, “This sentence is written in English, and not Italian,” then the phrase This sentence refers to that sentence. This is all straightforward, and is a well-accepted way of doing naming and referring.

There is another important point to make about the use of quotation marks. When a logician says

For any sentence S, if “S” is true, then S,

this is not a remark about the letter of the alphabet between “R” and “T”. It is a remark about sentences.

Finally, let us be clearer about substitution of names. If we have two names with the same denotation, then usually one name can be substituted for the other in a sentence without the newly-produced sentence changing its truth-value. Mark Twain is the same person as Samuel Clemens, so substituting ‘Samuel Clemens’ for ‘Mark Twain’ in the true sentence:

Mark Twain was not a famous 21st century U.S. president

will produce:

Samuel Clemens was not a famous 21st century U.S. president

which is also true. The substitution preserves truth. At least it does here, but it does not in some other contexts. There are well known exceptions to this substitution principle. For example, suppose this is true:

John said, “Mark Twain was not a famous 21st century U.S. president.”

If John said nothing about Samuel Clemens, then the above substitution would turn a true sentence into a false one. So, in substituting we need to be careful about substituting inside a quoted phrase.

All these remarks about truth, reference, and substitution seem to be straightforward and not troublesome. Unfortunately, together they do lead to trouble, and the resolution of the difficulty is still an open problem in philosophical logic. Why is that? The brief answer is that Tarski’s sentence with the supposedly uncontroversial assumptions above can be used to produce the Liar Paradox. The less brief answer refers to Tarski’s Undefinability Theorem of 1936.

c. Tarski’s Undefinability Theorem

This article began with a sketch of the Liar Argument using Liar sentence L. To appreciate the central role in the Liar Argument of Tarski’s rephrasing of Aristotle’s point, we need to examine more than just a sketch of the argument. Alfred Tarski proposed a more formal characterization called Schema T or Convention T:

X is true if, and only if, p,

where “p” is a variable for a grammatical sentence and “X” is a name for that sentence. Here is one instance of that general schema:

“Snow is white” is true if, and only if, snow is white.

It is assumed here that we are building a theory of truth for English, and that we are using English to state the theory.

Tarski was the first person to claim that any theory of truth that could not entail all sentences of this schema would fail to be an adequate theory of truth.

If we were to build a theory of truth for German instead of English, but use English to state the theory, then the theory should, among other things, at least entail the T-sentence:

“Der Schnee ist weiss” is true in German if, and only if, snow is white.

A great many philosophers believe Tarski is correct when he claims his Convention T is a necessary condition on any successful theory of truth for any language, and the T sentences should be theorems in the metalanguage. But wait! Do we want all the T-sentences to be entailed and thus come out true? Probably not the T-sentence for the Liar Sentence. That T-sentence is:

T`L´ if and only if L.

Here T is the truth predicate (informally it is the predicate “__ is a true sentence”), and L is the Liar Sentence, namely ~T`L´. Substituting the latter for L on the right of the above biconditional yields the contradiction:

T`L´ if and only if ~T`L´.

That is the argument of the Liar Paradox, very briefly.
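The difference between an ordinary T-sentence and the Liar's T-sentence can be seen by asking which truth values satisfy each biconditional. The sketch below assumes classical two-valued truth; the helper name and the snow_is_white variable are illustrative assumptions.

```python
def t_sentence_solutions(right_hand_side):
    """Truth values t for the left side that satisfy 't if and only if right_hand_side(t)'."""
    return [t for t in (True, False) if t == right_hand_side(t)]

snow_is_white = True  # assumed fact about the world, for illustration only

# "Snow is white" is true if, and only if, snow is white.
ordinary = t_sentence_solutions(lambda t: snow_is_white)
# T`L´ if and only if ~T`L´.
liar = t_sentence_solutions(lambda t: not t)

print(ordinary)  # [True]
print(liar)      # []
```

The ordinary T-sentence fixes exactly one value for its left side, while the Liar's T-sentence is satisfied by neither classical value.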

Tarski added precision to the discussion of the Liar by focusing not on a natural language such as English but on a classical, interpreted, formal language powerful enough to express at least elementary arithmetic. Here the difficulties produced by the Liar Argument became much clearer; and, very surprisingly, he was able to prove that Convention T, plus the assumption that the language contains its own concept of truth, produces semantic incoherence.

The proof requires the following additional assumptions. Here is a quotation from (Tarski 1944):

I. We have implicitly assumed that the language in which the antinomy is constructed contains, in addition to its expressions, also the names of these expressions, as well as semantic terms such as the term “true” referring to sentences of this language; we have also assumed that all sentences which determine the adequate usage of this term can be asserted in the language. A language with these properties will be called “semantically closed.”

II. We have assumed that in this language the ordinary laws of logic hold.

Tarski claimed that the crucial, unacceptable assumption of the formal version of the Liar Argument is the self-reference allowed by any semantically closed language because any semantically closed language contains its own global truth predicate, and this leads to a contradiction.

To expand on this point, in order for there to be a grammatical and meaningful Liar Sentence in a language, there must be a definable notion of is true which holds for the true sentences and fails to hold for the other sentences. If there were such a global truth predicate, then the predicate __ is a false sentence would also be definable; and [here is where we need the power of elementary number theory] a Liar Sentence would exist, namely a complex sentence ∃x(Qx & ~Tx), where Q and T are predicates that are satisfied by names of sentences. More specifically, T is the one-place, global truth predicate satisfied by all and only the names [that is, numerals for the Gödel numbers] of the true sentences, and Q is a one-place predicate that is satisfied only by the name of ∃x(Qx & ~Tx). But if so, then one can eventually deduce a contradiction. This correct deduction by Tarski is a formal analog of the informal argument of the Liar Paradox.

The contradictory result apparently tells us that the argument began with a false assumption. According to Tarski, the error that causes the contradiction is the assumption that the global truth predicate can be well-defined. Therefore, Tarski asserts that truth is not definable within a classical formal language that is classically interpreted—thus the name Undefinability Theorem or Indefinability Theorem. Tarski’s Theorem establishes that classically interpreted languages capable of expressing elementary arithmetic cannot contain their own global truth predicate, and so cannot be semantically closed.

Truth cannot be defined properly within a classical formal language, but there is no special difficulty in giving a proper definition of truth for a classical formal language, provided it is done outside the language; and Tarski himself was the first person to do this. In 1933, he created the first formal semantics for quantified predicate logic. Here are two imperfect examples of how he partly defines truth. First, the simple sentence Fa is true if, and only if, a is F (that is, a has property F, which in turn requires that a be a member of the extension of predicate F, where the extension is the set of all objects having the property F). For example, we might formalize the English sentence, Alfred is fat, by translating it as Fa; then Tarski is telling us that Fa is true just in case Alfred is a member of the set of all things that are fat.

For a second example of partly defining truth, Tarski says the universally quantified sentence ∀xFx is true if, and only if, all the objects in the domain are members of the set of objects that are F.

To repeat, a little more precisely but still imperfectly, Tarski’s theory implies that, if we have a simple, formal sentence ‘Fa’ in our formal language, say classical predicate logic, in which ‘a’ is the name of some object in the domain of discourse (that is, what we can talk about) and if ‘F’ is a predicate designating a property that perhaps some of those objects have, then ‘Fa’ is true in the object language if, and only if, a is a member of the set of all things having property F. That set is called the extension of ‘F’. Tarski also described this by saying that a satisfies ‘F’. For the more complex sentence ‘∀xFx’ in our language, it is true just in case every object in the domain is in the extension of ‘F’.
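These two partial truth conditions can be sketched with a toy model in which Python plays the role of the meta-language. Everything concrete here is an assumption made for illustration: the three-object domain, the extension chosen for F, and the function names.

```python
# Hypothetical interpretation of a tiny object language.
domain = {"alfred", "bertha", "carl"}   # domain of discourse
extension = {"F": {"alfred", "carl"}}   # extension of predicate F (the fat things)
denotation = {"a": "alfred"}            # the name 'a' denotes Alfred

def true_atomic(pred, name):
    """'Fa' is true iff the object named by 'a' is in the extension of 'F'."""
    return denotation[name] in extension[pred]

def true_universal(pred):
    """'∀xFx' is true iff every object in the domain is in the extension of 'F'."""
    return all(obj in extension[pred] for obj in domain)

print(true_atomic("F", "a"))   # True
print(true_universal("F"))     # False
```

The truth conditions are stated about the object language from outside it, which is the arrangement the text goes on to describe in terms of language 0 and language 1.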

These two definitions are still imprecise because the appeal to the concept of property should be eliminated, and the definitions should appeal to the notion of formulas being satisfied by sequences of objects. However, ignoring those details, what we have here are two examples of partially defining truth for the formal object language, say language 0, but doing it from outside language 0, in a meta-language, say language 1, namely English that contains some arithmetic and set theory and that might or might not contain language 0 itself. Tarski was able to show that in language 1 we do satisfy Convention T for the object language 0, because the equivalences:

`Fa´ is true in language 0 if, and only if, Fa

`∀xFx´ is true in language 0 if, and only if, ∀xFx

are both deducible in language 1, as are the other T-sentences.

Despite Tarski’s having this success with defining truth for an object language in its meta-language, Tarski’s Undefinability Theorem establishes that there is apparently no hope of defining truth within the object language itself.

Tarski then took on the project of discovering how close he could come to having a well-defined truth predicate within a classical formal language without actually having one. That project, his hierarchy of meta-languages, is also his key idea for solving the Liar Paradox. The project is discussed below.

2. Overview of Ways Out of the Paradox

a. Five Ways Out

There are many proposed solutions to the paradox. A solution which says to quit using language will stop the Liar Paradox; but surely the Liar Paradox can be stopped by making more conservative changes than this radical, ad hoc solution. All other things being equal, adopting simple, intuitive and conservative semantic principles is to be preferred ideally to adopting ad hoc, complicated and less intuitive semantic principles that have many negative consequences. The same goes for revision of a concept or revision of a logic.

So, we will not quit using language. Nor should we try to find a way out by declaring that we must adhere to the principle, Avoid all paradoxes. Saying that is trivial and unhelpful unless it also gives us other guidance about how to avoid them.

Shall we say instead that the problem is due somehow to the notorious vagueness of English (or whatever natural language is used to create the paradox)? Perhaps. However, more needs to be said because Tarski showed that by using a vagueness-free formal language he could produce the Liar Paradox.

Maybe the route to a solution is to uncover some subtle equivocation in our concepts employed in producing the contradiction. There have been many suggestions along this line, but none have been widely accepted.

Perhaps we should learn to live with paradox. Or perhaps we should simply accept that there is a contradiction unless we make appropriate changes. Because the Liar Paradox depends crucially upon our ideas about how to make inferences and how to understand the key semantic concepts of truth, reference, and negation, one might reasonably suppose that one of these needs revision. But we should proceed cautiously. No one wants to solve the Paradox by being heavy-handed and jettisoning more than necessary. We should be alert to the fact that any changes we do make might have their own drawbacks.

One final word of caution. No doubt the ordinary meaning of the word true is a bit vague, but if we decide to solve the Liar Paradox by revising the concept of truth, then we must remember that explications of true have to be true to some core of the ordinary meaning of true, lest the revision be so great that it is no longer a revision but instead a change of subject.

If we adopt the metaphor of a paradox as being an argument which starts from the home of seemingly true assumptions and which travels down the garden path of seemingly valid steps into the den of a contradiction, then a solution to the Liar Paradox has to find something wrong with the home, find something wrong with the garden path, or find a way to live within the den. Less metaphorically, the main, systematic ways out of the Paradox are the following:

  1. The Liar Sentence is ungrammatical and so has no truth value (yet the argument of the Liar Paradox depends on it having a truth value).
  2. The Liar Sentence is grammatical but meaningless and so has no truth value.
  3. The Liar Sentence is grammatical and meaningful but still it has no truth value; it falls into the truth gap.
  4. The Liar Sentence is grammatical, meaningful and has a truth value, but one other step in the argument of the Liar Paradox is faulty.
  5. The argument of the Liar Paradox is acceptable, and we need to learn how to live with the Liar Sentence being both true and false.

Two philosophers might take the same way out, but for different reasons.

In presenting any of these five proposed solutions to the Paradox, it is helpful to explore the details and the implications. For example, do they accept, reject or revise the Law of Addition that was appealed to in step 3 of the Liar Argument back in Section 1 of this article? That step permits the deduction of L or Q from L alone. A solution is unacceptable if it cannot answer this question and give the answer a principled justification of some sort.

The five proposed solutions have a key feature in common. They recommend or presuppose logical monism and not logical pluralism. That is, they suppose there is a single, universal logic. This supposition has been challenged by some twentieth-century logicians, although most others remain monists.

There are many suggestions for how to deal with the Liar Paradox, but most are never developed to the point of giving a detailed theory that can speak of its own syntax and semantics with precision. Some give philosophical arguments for why this or that conceptual reform is plausible as a way out of paradox, but then do not show that their ideas can be carried through in a rigorous way. Other attempts at solutions take the formal route and then require changes in standard formalisms so that a formal analog of the Liar Paradox’s argument fails, but then they do not offer a philosophical argument to back up these formal changes other than essentially saying, “It is successful in avoiding paradoxes so far.” A decent theory of truth showing the way out of the Liar Paradox requires both a coherent formalism (or at least a systematic theory of some sort) and a philosophical justification backing it up. The point of the philosophical justification is an unveiling of some hitherto unnoticed or unaccepted rule of language for all sentences of some category which has been violated by the argument of the Paradox. In brief, the philosophical point is that a paradox’s diagnosis should not proceed independently of its rigorous or formal treatment.

Some proponents of their own favorite solution to the paradox agree that a systematic approach to the paradox is valuable, and they point out that in some formalism, say first-order arithmetic, the Liar argument cannot be reconstructed. For one example, perhaps the proponents will argue that the sub-argument from the Liar sentence being true to its being false is acceptable, but the sub-argument from the Liar sentence being false to its being true cannot be reconstructed in their formalism. From this they conclude that the Liar sentence is simply false and paradox-free. This may be the key to solving the Paradox, but it is not successful if there is no satisfactory response to the complaint that perhaps their reconstruction using that formalism shows more about the inadequacy of the formalism than the proper way out of the paradox.

Hartley Slater offers a systematic treatment of the Liar Paradox that does not require formal languages, but that explains why treatments of the Liar with various formalizations, such as Tarski’s project of a hierarchy of metalanguages and his promotion of his Convention T in classical predicate logic, are inadequate. Slater’s systematic treatment concludes that “Indexicality infuses the whole of language, making Tarski’s Truth Scheme inappropriate, and thus resolving the Liar Paradox” (Slater 2012, p. 85).

This need to have a systematic approach was seriously challenged by Ludwig Wittgenstein in his Philosophical Remarks:

I predict a time when there will be mathematical investigations of calculi containing contradictions, and people will actually be proud of having emancipated themselves from worries about consistency.

In 1938 in a discussion group with Alan Turing on the foundations of mathematics, Wittgenstein said one should try to overcome “the superstitious fear and dread of mathematicians in the face of a contradiction.” The proper way to respond to any paradox, he said, is by an ad hoc reaction and not by any systematic treatment designed to cure both it and any future ills. Symptomatic relief is sufficient. He said it may appear legitimate, at first, to admit that the Liar Sentence is meaningful and also that it is true or false, but the Liar Paradox shows that one should retract this admission and either just not use the Liar Sentence in any arguments, or say it is not really a sentence, or at least say it is not one that is either true or false. Wittgenstein is not particularly concerned with which choice is made. And, whichever choice is made, he claimed it need not be backed up by any theory that shows how to systematically incorporate the choice. He treated the whole situation cavalierly and unsystematically. After all, he said, the language cannot really be incoherent because we have been successfully using it all along, so why all this fear and dread? Most logicians disagree with Wittgenstein and want systematic removal of the Paradox.

Disagreeing with Wittgenstein, P. F. Strawson has promoted the performative theory of truth as a way out of the Liar Paradox. Strawson has argued that the proper way out of the Liar Paradox is to carefully re-examine how the term truth is really used by speakers. He says such an investigation will reveal that the Liar Sentence is meaningful but fails to express a proposition.

To explore Strawson’s response more deeply, notice that Strawson’s proposed solution depends on the distinction between a proposition and the declarative sentence used to express that proposition. The next section explores what a proposition is, but let us agree for now that a sentence, when uttered, either expresses a true proposition, expresses a false proposition, or fails to express any proposition. According to Strawson, when we say some proposition is true, we are not making a statement about the proposition. We are not ascribing a property to the proposition such as the property of correspondence to the facts, or coherence, or usefulness. Rather, when we call a proposition true, we are only approving it, or praising it, or admitting it, or condoning it. We are performing an action of that sort. Similarly, when we say to our friend, “I promise to pay you fifty dollars,” we are not ascribing some property to the proposition, I pay you fifty dollars. Rather, we are performing the act of promising the $50. For Strawson, when speakers utter the Liar Sentence, they are attempting to praise a proposition that is not there, as if they were saying Ditto when no one has spoken. The person who utters the Liar Sentence is making a pointless utterance. According to this performative theory, the Liar Sentence is grammatical, but it is not being used to express a proposition and so is not something from which a contradiction can be derived. Strawson’s way out has been attractive to some researchers, but not to a majority.

Is it obvious that there is a unique way out? Perhaps the best we can do is to have a variety of ways out, some of which are better than some others in certain respects. That point should be kept in mind when this article cavalierly speaks of the way out.

b. Sentences, Statements, and Propositions

The Liar Paradox can be expressed in terms of sentences, statements, and propositions.

The Strengthened Liar might begin with any of these:

  • This sentence is not true.
  • This statement is not true.
  • This proposition is not true.
  • This is not true.

The sentence “I like that” can assert two very different propositions when asserted on two different occasions, one in which the word “that” refers to the dog on the mat, and another in which the same word refers to the cat on the mat. And two sentences can express the same proposition, such as when someone says both, “I like that” and “I like the cat on the mat.”

Sentences are linguistic expressions, whereas statements and propositions are not. A proposition is usually said to be the content of a meaningful sentence. We sometimes use sentences to make statements and assert propositions, but we sometimes use sentences to ask questions and to threaten our enemies. When speaking about sentences, we usually are speaking about sentence types, not tokens. Tokens are the sound waves or the ink marks or the electronic events. Types are what is the same when we say that the same sentence was spoken by John, recorded in ink in his notebook, and sent over the Internet to his friend. In the process of asserting the Strengthened Liar sentence, the person is using a token of the word this to refer to a special sentence type, namely to the Strengthened Liar sentence. In the process of asserting the Strengthened Liar proposition, the person is using a token of the word this to refer to the meaningful content of a special sentence type, namely to the Strengthened Liar sentence.

This is a bit vague, but it is difficult to remove the vagueness. Philosophers disagree with each other about what a statement is, and they disagree even more about what a proposition is. Most philosophers will say that sentences do not themselves make statements. Rather it is we speakers who use sentences to make statements. Some philosophers will claim that it is statements or propositions that are primarily true or false, and a sentence is true or false only in a secondary sense. But other philosophers disagree and believe that it is sentences that are primarily true or false.

Despite Quine’s famous complaint that there are no propositions because there can be no precise criteria for deciding whether two different sentences are being used to express identical propositions, there are some very interesting reasons why researchers who work on the Liar Paradox should focus on propositions rather than on either sentences or statements, but those reasons are not explored here. John Corcoran suggests the following position:

A judgment is a private act that results in a belief; a statement is a public event usually involving a sentence. Each judgment and each statement is performed by a unique person at a unique time and place. Propositions and sentences are timeless and placeless abstractions. A proposition is an intensional entity; it is a meaning composed of concepts. A sentence is a linguistic entity. A written sentence is a string of characters. A sentence can be used by a person to express meanings, but no sentence is intrinsically meaningful. Only propositions are properly said to be true or to be false—in virtue of facts, which are subsystems of the universe (Corcoran 2009, p. 71).

For a discussion of the need for propositions, see (Barwise and Etchemendy 1987). The present article continues to speak primarily of sentences rather than propositions, though only for the purpose of simplicity.

c. An Ideal Solution to the Liar Paradox

Ideally, we would like a proposed solution to the Liar Paradox to provide a solution to all the versions of the Liar Paradox, such as the Strengthened Liar Paradox, the version that led to Buridan’s proof of God’s existence, and the contingent versions of the Liar Paradoxes. The solution should solve the paradox both for natural languages and formal languages, or provide a good explanation of why the paradox can be treated properly only in one but not the other. The contingent versions of the Liar Paradox are going to be troublesome because, if the production of the paradox does not depend only on something intrinsic to the sentence but also depends on what circumstances occur in the world, then there needs to be a detailed description of when those circumstances are troublesome and when they are not, and why.

It would be ideal if we had a solution to both the Liar Paradox and Curry’s Paradox, another paradox that turns on self-reference. Haskell Curry’s paradox concerns the following sentence C:

If C is true then ⊥.

The sentence C above contains itself. The symbol “⊥” abbreviates a contradiction. This leads to a paradox because one instance of Tarski’s Convention T is the equivalence:

C is true iff C.

Substituting Curry’s definition of C for the second C on the right yields:

C is true iff if C is true then ⊥.

Now let us begin to construct a multi-step Conditional Proof. Assume that C is true. Then, because of the last equivalence, if C is true then ⊥. So, by modus ponens, ⊥. Hence, by Conditional Proof, we have established that:

if C is true then ⊥.

By the definition of C, this is:

C.

Thus, by the first equivalence above, because we have established its right side:

C is true.

Therefore, by modus ponens on the previous two steps, we may infer:

⊥.

So, we have proved a contradiction. The outcome is a self-referential paradox that does not rely on negation, as the Liar Paradox does.
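
The derivation above can also be checked mechanically. The following is a minimal sketch in Lean 4, offered as an illustration rather than as part of Curry’s own presentation. It assumes that the instance of Convention T lets us treat “C is true” and C interchangeably, so the self-referential sentence is modeled by a hypothesis h saying that C is equivalent to “if C then Q.” The names curry, C, Q, h, and hc are chosen here for illustration. Q stands in for ⊥, and the proof goes through for any Q whatsoever.

```lean
-- Assumption of this sketch: "C is true" and C are identified via Convention T,
-- so Curry's sentence is represented by the hypothesis h : C ↔ (C → Q).
theorem curry (C Q : Prop) (h : C ↔ (C → Q)) : Q :=
  -- Conditional Proof: assume C, unfold it with h, then apply modus ponens.
  have hc : C → Q := fun c => (h.mp c) c
  -- By h, the conditional just proved is C itself; one more modus ponens gives Q.
  hc (h.mpr hc)
```

No negation appears in the proof, and nothing is assumed about Q. The only ingredients are the equivalence h, Conditional Proof, and modus ponens, so a way out of Curry’s Paradox has to reject h or restrict one of those inference steps.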

To have an ideal solution to the Liar Paradox, it would be reasonable to require a solution not only to the Curry Paradoxes but also to the Yablo Paradox, which is Liar-like and Curry-like but which apparently does not rely on self-reference. In Stephen Yablo’s paradox, there is no way to coherently assign a truth value to any of the sentences in the countably infinite sequence of sentences of the form, None of the subsequent sentences are true. Imagine an unending line of people in numerical order who say, and only say, simultaneously:

1. Everybody after me is lying.

2. Everybody after me is lying.

3. Everybody after me is lying.

Ask yourself whether the first person’s sentence in the sequence is true or false. To produce the paradox it is crucial that the line of speakers be infinite. Notice that no sentence overtly refers to itself. There is controversy in the literature about whether the paradox actually contains a hidden appeal to self-reference or circularity. See (Beall 2001) for more discussion.
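
One way to see why the infinite length of the line matters is to check what happens when the line is cut off after finitely many speakers. The sketch below is an illustration added under that assumption and is not Yablo’s own construction. It reads “is lying” as “says something untrue,” and the function name consistent_assignments and its parameter n are hypothetical. It tries every assignment of truth values to n speakers and keeps those in which each speaker’s sentence is true exactly when every later speaker’s sentence is untrue.

```python
from itertools import product


def consistent_assignments(n):
    """Return all consistent truth-value assignments for a line of n speakers.

    Speaker i's sentence ("everybody after me says something untrue") is true
    exactly when no later speaker's sentence is true.
    """
    consistent = []
    for values in product([True, False], repeat=n):
        # values[i + 1:] is empty for the last speaker, so any(...) is False
        # and the last sentence must be (vacuously) true.
        if all(values[i] == (not any(values[i + 1:])) for i in range(n)):
            consistent.append(values)
    return consistent


for n in (1, 2, 3, 4):
    print(n, consistent_assignments(n))
# For n = 3 the only consistent assignment is (False, False, True).
```

In every finite line the last speaker has nobody after him, so his sentence is vacuously true and all the earlier sentences come out false. That single consistent assignment depends on there being a last speaker, which the unending line lacks.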

To summarize, an important goal for the best solution, or solutions, to the Liar Paradox is to offer us a deeper understanding of how our semantic concepts and principles worked to produce the Paradox in the first place, especially if a solution to the Paradox requires changing them. We want to understand the concepts of truth, reference, and negation that are involved in the Liar Paradox. In addition to these, there are the subsidiary principles and related notions of denial, definability, naming, meaning, predicate, property, presupposition, antecedent, and operating on prior sentences to form newer meaningful ones rather than merely newer grammatical ones. We would like to know what limits there are on all these notions and mechanisms, and how one impacts another.

What are the important differences among the candidates for bearers of truth? The leading candidates are sentences, propositions, statements, claims, and utterances. Is one primary, while the others are secondary or derivative? Ideally, we would like to know a great deal more about truth, but also falsehood and the related notions of fact, situation and state of affairs. We want to better understand what a language is and what the relationship is between an interpreted formal language and a natural language, relative to different purposes. Finally, it would be instructive to learn how the Liar Paradoxes are related to all the other paradoxes.

That may be quite a lot to ask, but then our civilization does have some time to investigate all this before the Sun expands and vaporizes our little planet.

d. Should Classical Logic be Revised?

An important question regarding the Liar Paradox is: What is the relationship between a solution to the Paradox for (interpreted) formal languages and a solution to the Paradox for natural languages? There is significant disagreement on this issue. Is appeal to a formal language a turn away from the original problem, and so just changing the subject? Can one say we are still on the subject when employing a formal language because a natural language contains implicitly within it some formal language structure? Or should we be in the business of building an ideal language to replace natural language for the purpose of philosophical study?

Is our natural language, for example, English, a semantically closed language? Does English have one or more logics? Should we conclude from the Liar Paradox that the logic of English cannot be standard logic but must be one that restricts the explosion that occurs due to our permitting the deduction of anything whatsoever from a contradiction? Should we say English really has truth gaps or perhaps occasional truth gluts (sentences that are both true and false)? So many questions.

Or instead can a formal language be defended on the ground that natural language is inconsistent and the formal language is showing the best that can be done rigorously? Can sense even be made of the claim that a natural language is inconsistent, for is not consistency a property only of languages with a rigorous structure, namely formal languages and not natural languages? Should we say people can reason inconsistently in natural language without declaring the natural language itself to be inconsistent? This article raises, but will not resolve, these questions, although some are easier to answer than others.

Many of the most important ways out of the Liar Paradox recommend revising classical formal logic. Classical logic is the formal logic known to introductory logic students as Predicate Logic in which, among other things, (i) all sentences of the formal language have exactly one of two possible truth values (TRUE, FALSE), (ii) the rules of inference allow one to deduce any sentence from an inconsistent set of assumptions, (iii) all predicates are totally defined on the range of the variables, and (iv) the formal semantics is the one invented by Tarski that provided the first precise definition of truth for a formal language in its metalanguage. A few philosophers of logic argue against any revision of classical logic by saying classical logic is the incumbent formalism that should be accepted unless an alternative is required (probably it is believed to be incumbent because of its remarkable success in expressing most of modern mathematical inference). Still, most other philosophers argue that classical logic is not the incumbent which must remain in office unless an opponent can dislodge it. Instead, the office has always been vacant.

In the decades since Tarski’s treatment of the Liar Paradox, there have been many new approaches that reject his classical, extensional logic in favor of alternative logics that do not require that his T-sentences be theorems of the metalanguage.

One critic of classical formal logic, Hartley Slater, says the usual formal languages fail at the crucial point of properly treating indexicals, words whose reference changes with context:

It is a recognition of the previous points about indexicality and sentence nominalisations that gets one out of the Liar… [B]ut the Truth Scheme “‘p’ is true ≡ p” does not apply when indexicals are involved, since one cannot say: ‘He is happy’ is true ≡ he is happy. (Slater 2012, p. 72)

Some philosophers object to revising classical logic if the purpose in doing so is merely to find a way out of the Paradox. They say that philosophers should not build their theories by attending to the queer cases. There are more pressing problems in the philosophy of logic and language than finding a solution to the Paradox, so any treatment of it should wait until these problems have a solution. From the future resulting theory which solves those problems, one could hope to deduce a solution to the Liar Paradox. However, for those who believe the Paradox is not a minor problem but is one deserving of immediate attention, there can be no waiting around until the other problems of language are solved. Perhaps the investigation of the Liar Paradox will even affect the solutions to those other problems.

3. Assessing the Five Ways Out

There have been many systematic proposals for ways out of the Liar Paradox. Below is a representative sample of five of the main ways out.

a. Russell’s Type Theory

Bertrand Russell said natural language is incoherent, but its underlying sensible part is an ideal formal language (such as the applied predicate logic of Principia Mathematica). He agreed with Henri Poincaré that the source of the Liar trouble is its use of self-reference. Russell’s way out was to rule out self-referential sentences as being ungrammatical or not well-formed in his ideal language.

In 1908 in his article “Mathematical Logic as Based on the Theory of Types” that is reprinted in (Russell 1956, p. 79), Russell solves the Liar with his ramified theory of types. This is a formal language involving an infinite hierarchy of, among other things, orders of propositions:

If we now revert to the contradictions, we see at once that some of them are solved by the theory of types. Whenever ‘all propositions’ are mentioned, we must substitute ‘all propositions of order n’, where it is indifferent what value we give to n, but it is essential that n should have some value. Thus when a man says ‘I am lying’, we must interpret him as meaning: ‘There is a proposition of order n, which I affirm, and which is false’. This is a proposition of order n+1; hence the man is not affirming any propositions of order n; hence his statement is false, and yet its falsehood does not imply, as that of ‘I am lying’ appeared to do, that he is making a true statement. This solves the liar.

Russell’s implication is that the informal Liar Sentence is meaningless because it has no appropriate translation into his formal language since an attempted translation violates his type theory. This theory is one of his formalizations of the Vicious-Circle Principle: Whatever involves all of a collection must not be one of the collection. Russell believed that violations of this principle are the root of all the logical paradoxes.

His solution to the Liar Paradox has the drawback that it places so many subscript restrictions on what can refer to what. It is unfortunate that the Russell hierarchy requires even the apparently harmless self-referential sentences This sentence is in English and This sentence is not in Italian to be syntactically ill-formed. The type theory also rules out explicitly saying (within his formalism) that legitimate terms must have a unique type, or saying that properties have the property of belonging to exactly one category in the hierarchy of types, which, if we step outside the theory of types, seems to be true about the theory of types. Bothered by this, Tarski took a different approach to the Liar Paradox.

b. Tarski’s Hierarchy of Meta-Languages

Reflection on the Liar Paradox suggests that either informal English (or any other natural language) is not semantically closed or, if it is semantically closed as it appears to be, then it is inconsistent—assuming for the moment that it does make sense to apply the term inconsistent to a natural language with a vague structure. Because of the vagueness of natural language, Tarski quit trying to find the paradox-free structure within natural languages and concentrated on developing formal languages that did not allow the deduction of a contradiction, but which diverge from natural language as little as possible.

One virtue of Tarski’s way out of the Liar Paradox is that it does permit the concept of truth to be applied to sentences that involve the concept of truth, provided we apply level subscripts to the concept of truth and follow the semantic rule that any subscript inside a pair of quotation marks must always be smaller than the subscript outside but still within the sentence; any violation of this rule produces a meaningless, ungrammatical formal sentence. Let language of level 1 be the meta-language of the object language that is in or at level 0. Level 0 sentences do not contain truth or similar terms, but would contain, say, Paris is the capital of France. The sentence saying this level 0 sentence is true occurs in level 1. It would be: ‘Paris is the capital of France’ is true₀. No sentence is allowed to contain its own truth predicate.
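
The subscript rule can be illustrated with a small checker. The following Python sketch is a simplification under stated assumptions, not Tarski’s formal definition: a sentence is either an atom with no truth vocabulary or a subscripted truth predicate applied to a quoted sentence, and connectives and quantifiers are left out. The class names Atom and TrueAt and the functions max_subscript and well_formed are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """A sentence with no truth vocabulary, e.g. 'Paris is the capital of France'."""
    text: str


@dataclass(frozen=True)
class TrueAt:
    """The sentence: 'quoted' is true_subscript."""
    subscript: int
    quoted: "Sentence"


Sentence = Union[Atom, TrueAt]


def max_subscript(s):
    """Largest truth subscript occurring anywhere in s, or -1 if there is none."""
    if isinstance(s, Atom):
        return -1
    return max(s.subscript, max_subscript(s.quoted))


def well_formed(s):
    """Every subscript inside the quotation marks must be smaller than the one outside."""
    if isinstance(s, Atom):
        return True
    return well_formed(s.quoted) and max_subscript(s.quoted) < s.subscript


paris = Atom("Paris is the capital of France")
print(well_formed(TrueAt(0, paris)))             # True: a level 1 sentence
print(well_formed(TrueAt(1, TrueAt(0, paris))))  # True: a level 2 sentence
print(well_formed(TrueAt(0, TrueAt(0, paris))))  # False: inner subscript is not smaller
```

On this encoding, a sentence that applied a truth predicate to itself would need a subscript smaller than its own, which is why the rule blocks the Liar Sentence.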

The rule for subscripts stops the formation of both the Classical Liar Sentence and the Strengthened Liar Sentence anywhere within the hierarchy. The subscripting also stops paradoxical chains that start as follows:

The next sentence is true.

The previous sentence is false.

Another virtue of the Tarski way out is that it provides a way out of the Yablo Paradox.

Russell’s solution calls This sentence is in English ill-formed, but Tarski’s solution does not, so that feature is also a virtue of Tarski’s way out. Tarski allows some self-reference, but not the self-reference involved in the Liar Paradox.

Tarski’s clever treatment of the Liar Paradox unfortunately has drawbacks. English has a single word true, but Tarski is replacing this with an infinite sequence of truth-like formal predicates, each of which is satisfied by the truths only of the language below it in the hierarchy of languages. Intuitively, a more global truth predicate should be expressible in the language it applies to. One hopes to be able to talk truly about one’s own semantic theory. The Tarski way out does not allow us even to say that in all languages of the hierarchy, some sentences are true. To use Wittgenstein’s phrase from his Tractatus, the character of the hierarchy can be shown but not said.

Despite these restrictions and despite the unintuitive and awkward hierarchy, Quine defends Tarski’s way out as the best of the ways. Here is Quine’s defense:

Revision of a conceptual scheme is not unprecedented. It happens in a small way with each advance in science, and it happens in a big way with the big advances, such as the Copernican revolution and the shift from Newtonian mechanics to Einstein’s theory of relativity. We can hope in time even to get used to the biggest such changes and to find the new schemes natural. There was a time when the doctrine that the earth revolves around the sun was called the Copernican paradox, even by the men who accepted it. And perhaps a time will come when truth locutions without implicit subscripts, or like safeguards, will really sound as nonsensical as the antinomies show them to be. (Quine 1976)

Tarski adds to the defense by stressing that:

The languages (either the formalized languages or—what is more frequently the case—the portions of everyday language) which are used in scientific discourse do not have to be semantically closed. (Tarski 1944)

One criticism of Quine is that he is asking us to be patient and not to be so bothered by the complexity of the hierarchy, but he is giving no other justification for the hierarchy.

(Kripke 1975) criticized Tarski’s way out for its inability to handle contingent versions of the Liar Paradox such as one that begins with:

It is raining and this sentence is false

because Tarski cannot describe the contingency. That is, Tarski’s solution does not provide a way to specify the circumstances in which a sentence leads to a paradox and the circumstances in which it does not.

Putnam also criticized Tarski’s way out for its quietism about its own semantics:

The paradoxical aspect of Tarski’s theory, indeed of any hierarchical theory, is that one has to stand outside the whole hierarchy even to formulate the statement that the hierarchy exists. But what is this “outside place”—“informal language”—supposed to be? It cannot be “ordinary language,” because ordinary language, according to Tarski, is semantically closed and hence inconsistent. But neither can it be a regimented language, for no regimented language can make semantic generalizations about itself or about languages on a higher level than itself. (Putnam 1990, 13)

Within Tarski’s hierarchy of formal languages, we cannot say, Every language has true sentences (because no sentence can contain its own truth predicate in Tarski’s hierarchy) even though outside the hierarchy this is clearly a true remark about the hierarchy.

c. Kripke’s Hierarchy of Interpretations

Kripke’s way out of the Classical Liar Paradox requires a revision in our semantic principles but a less radical one than does the Russell solution or the Tarski-Quine solution. Kripke rejects the hierarchy of languages and retains the intuition that there is a single, semantically coherent and meaningful Liar Sentence, but argues that it is neither true nor false and so falls into a truth value gap. Kripke successfully develops the details using the tools of symbolic logic. Tarski’s Undefinability Theorem does not apply to languages having sentences that are neither true nor false. So, it can be argued that Kripke successfully shows that a semantically coherent formal language can contain its own global truth predicate in the sense that T(‘p’) is true whenever p is true, and is undefined if p is undefined. Not surprisingly, the negation of the truth predicate T does not quite express the concept of “not true” in the sense of meaning “false or undefined,” and so Kripke’s way out has a difficulty with the strengthened liar argument.

Let’s explore Kripke’s theory of truth in a bit more detail. He trades Russell’s and Tarski’s infinite syntactic complexity of languages for infinite semantic complexity of a single formal language. He rejects Tarski’s infinite hierarchy of meta-languages in favor of one formal language having an infinite hierarchy of partial interpretations. Consider a single formal language capable of expressing elementary number theory and containing a predicate T for truth (that is, for truth in an interpretation). Kripke assigns to T an elaborate interpretation, namely its extension (the set of sentences it is true of), its anti-extension (the set of sentences it is false of), and its undecideds (the set of sentences it is neither true nor false of). No sentence is allowed to be a member of both the extension and anti-extension of any predicate. Kripke allows the interpretation of T to change throughout the hierarchy. The basic predicates except the T predicate must have their interpretations already fixed in the base level. In the base level of the hierarchy, the predicate T is given a special extension and anti-extension. Specifically, its extension is all the (names of the) true sentences that do not actually contain the predicate symbol ‘T’, and its anti-extension is all the false sentences that do not contain ‘T’. The predicate ‘T’ is the formal language’s only basic partially-interpreted predicate.

As we ascend the hierarchy, distancing ourselves from the basic level, more and more complex sentences involving the symbol ‘T’ get added into the extension and anti-extension of the intended truth predicate T. Each step up Kripke’s semantic hierarchy is another partial interpretation of the language. As we go up a level we add into the extension of T all the true sentences containing T from the lower level. Ditto for the anti-extension.

For example, at the lowest level in the hierarchy we have the (formal equivalent of the) true sentence 7 + 5 = 12. Strictly speaking, it is not grammatical in English to say 7 + 5 = 12 is true because we make a use-mention error. More properly we should add quotation marks and say ‘7 + 5 = 12’ is true. In Kripke’s formal language, ‘7 + 5 = 12’ is true at the base level of the hierarchy. Meanwhile, the sentence that is the best candidate for saying it is true, namely ‘T(‘7+5=12’)’, is not true at that level, although it is added to the extension of T and thus is said to be true at the next higher level. Unfortunately, at this new level, the even more syntactically complex sentence ‘T(‘T(‘7+5=12’)’)’ is still not yet true. It will become true at the next higher level. And so goes the hierarchy of interpretations as it attributes truth to more and more sentences involving the concept of truth itself. The extension of T, that is, the class of names of sentences that satisfy T, grows but never contracts as we move up the hierarchy, and it grows by calling more true sentences true. Similarly, the anti-extension of T grows but never contracts as more false sentences involving T are correctly said to be false.

Kripke shows that T eventually becomes a truth-like predicate for its own level when the interpretation-building reaches the unique lowest fixed point at a countably infinite height in the hierarchy. At a fixed point, no new sentences are declared true or false, and at this level Kripke shows that the language also satisfies Tarski’s Convention T, so for this reason many philosophers are sympathetic to Kripke’s controversial claim that T is a truth predicate at that point. At this fixed point, the formal equivalent of the Liar Sentence still is neither true nor false, and so falls into the truth gap, just as Kripke set out to show. In this way, the Liar Paradox is solved, the formal language has a global truth predicate, the formal semantics is coherent, and many of our intuitions about semantics are preserved.
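The level-by-level growth of the extension and anti-extension can be illustrated on a finite toy language. The following Python sketch rests on strong simplifying assumptions and is not Kripke's own formalism: the sentence names (a, b, c, f, g, liar), the tuple encoding of formulas, and the functions evaluate and least_fixed_point are hypothetical; negation is the only connective and is evaluated by the strong Kleene rule; and because the toy language is finite, the fixed point is reached after finitely many steps rather than at a countably infinite height.

```python
# Toy sentences, each named so that T can refer to it.
# ("atom", value) stands in for a T-free arithmetic sentence with a fixed truth value.
SENTENCES = {
    "a": ("atom", True),             # stands in for '7 + 5 = 12'
    "b": ("T", "a"),                 # T('7 + 5 = 12')
    "c": ("T", "b"),                 # T('T('7 + 5 = 12')')
    "f": ("atom", False),            # stands in for '7 + 5 = 13'
    "g": ("T", "f"),                 # T('7 + 5 = 13')
    "liar": ("not", ("T", "liar")),  # says of itself that it is not true
}


def evaluate(formula, ext, anti):
    """Return True, False, or None (undefined) under a partial interpretation of T."""
    kind = formula[0]
    if kind == "atom":
        return formula[1]
    if kind == "T":
        name = formula[1]
        if name in ext:
            return True
        if name in anti:
            return False
        return None                  # T is undecided about this sentence
    if kind == "not":
        value = evaluate(formula[1], ext, anti)
        return None if value is None else (not value)
    raise ValueError("unknown formula kind: %r" % (kind,))


def least_fixed_point(sentences):
    """Grow the extension and anti-extension of T until nothing new is added."""
    ext, anti = set(), set()
    levels = []
    while True:
        new_ext = {n for n, s in sentences.items() if evaluate(s, ext, anti) is True}
        new_anti = {n for n, s in sentences.items() if evaluate(s, ext, anti) is False}
        if new_ext == ext and new_anti == anti:
            return ext, anti, levels  # fixed point: no sentence is newly declared true or false
        ext, anti = new_ext, new_anti
        levels.append((set(ext), set(anti)))


EXT, ANTI, LEVELS = least_fixed_point(SENTENCES)
for height, (ext, anti) in enumerate(LEVELS):
    print(height, sorted(ext), sorted(anti))
print("liar decided:", "liar" in EXT or "liar" in ANTI)
```

In this toy run the extension grows from {a} to {a, b} to {a, b, c}, the anti-extension grows from {f} to {f, g}, and the name liar never enters either set, which mirrors the claim that the Liar Sentence falls into the truth gap at the fixed point.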

However, there are difficulties with Kripke’s way out. His treatment of the Classical Liar stumbles on the Strengthened Liar and reveals why that paradox deserves its name. For a discussion of why, see (Kirkham 1992, pp. 293-4).

Some critics of Kripke’s theory say that in the fixed-point the Liar Sentence does not actually contain a global truth predicate but rather only a clever restriction on the truth predicate, and so Kripke’s Liar Sentence is not really the Liar Sentence after all; therefore we do not have here a solution to the Liar Paradox. Other philosophers say this is not a fair criticism of Kripke’s theory since Tarski’s Convention T, or some other intuitive feature of our concept of truth, must be restricted in some way if we are going to have a formal treatment of truth.

What can more easily be agreed upon by the critics is that Kripke’s candidate for the Liar sentence falls into the truth gap in Kripke’s theory at all levels of his hierarchy, so it is not true in his theory. [We are making this judgment that it is not true from within the meta-language in which sentences are properly said to be true or else not true.] However, in the object language of the theory, one cannot truthfully say the Liar Sentence is not true since the obvious candidate expression for that, namely ~Ts, is not true, but rather falls into the truth gap. Therefore, Kripke’s truth-gap theory cannot state its own thesis.

Robert Martin and Peter Woodruff created the same way out as Kripke, though a few months earlier and in less depth.

d. Barwise and Etchemendy

Another way out says the Liar Sentence is meaningful and is true or else false, but one special step of the argument in the Liar Paradox is incorrect, namely, the inference from the Liar Sentence’s being false to its being true. Arthur Prior, following the informal suggestions of Jean Buridan and C. S. Peirce, takes this way out and concludes that the Liar Sentence is simply false. So do Jon Barwise and John Etchemendy, but they go on to present a detailed, formal treatment of the Paradox that depends crucially upon using propositions rather than sentences. The details of their treatment will not be sketched here. Their treatment says the Liar Proposition is simply false on one interpretation but simply true on another interpretation, and that the argument of the Paradox improperly exploits this ambiguity. The key ambiguity is to conflate the Liar Proposition’s negating itself with its denying itself. Similarly, in ordinary language we are not careful to distinguish asserting that a proposition is false from denying that it is true.

Three positive features of the Barwise-Etchemendy solution are that (i) it applies to the Strengthened Liar, (ii) its propositions are always true or false, but never both, and (iii) it shows the way out of paradox both for natural language and interpreted formal language. Yet there is a price to pay. No proposition in their system can be about the whole world, and this restriction is there for no independent reason but only because otherwise we would get a paradox.

e. Paraconsistency

A more radical way out of the Paradox is to argue that the Liar Sentence is both true and false. This solution is a version of dialethism, the thesis that some contradictions are true. It embraces the Liar contradiction, then tries to limit the damage that is ordinarily a consequence of that embrace. This way out changes the classical rules of semantics in two ways: (1) it allows the Liar Sentence to be both true and false, and (2) it limits the damage by preventing the semantic incoherence that occurs from allowing everything to follow from any contradiction. The damaging principle of classical logic, called Explosion, is: (p & ~p) ⊧ q. A logic for which Explosion fails is called a paraconsistent logic.

This way out was initially promoted primarily by Graham Priest in 1979. It succeeds in avoiding semantic incoherence while offering a formal, detailed treatment of the Paradox. Priest is not a logical pluralist, and he proposes that there is one true paraconsistent logic. One noteworthy feature of Priest’s truth-glut semantics is that it is the same as Kleene’s strong three-valued semantics with truth-gaps if we apply this translation scheme:

Kleene            Priest
True              True only
False             False only
No Truth Value    Both True and False
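The translation scheme can be made concrete with a small Python sketch. It assumes the usual textbook presentation, in which the two semantics share the same three-valued tables for negation and conjunction and differ only in which values count as designated (that is, as good enough for asserting). The numeric encoding 2, 1, 0 and all the names below are illustrative choices, not notation from Kleene or Priest.

```python
from itertools import product

TRUE, MIDDLE, FALSE = 2, 1, 0    # MIDDLE: "no truth value" (Kleene) or "both" (Priest)
VALUES = (TRUE, MIDDLE, FALSE)


def neg(x):
    """Negation swaps TRUE and FALSE and leaves MIDDLE fixed."""
    return 2 - x


def conj(x, y):
    """Conjunction takes the lower of the two values."""
    return min(x, y)


K3_DESIGNATED = {TRUE}           # truth-gap reading: only TRUE is designated
LP_DESIGNATED = {TRUE, MIDDLE}   # truth-glut reading: "both" is designated too


def explosion_holds(designated):
    """Check (p & ~p) |= q: every valuation designating the premise designates q."""
    for p, q in product(VALUES, repeat=2):
        if conj(p, neg(p)) in designated and q not in designated:
            return False         # counterexample found
    return True


# A sentence equivalent to its own negation can only take the middle value.
SELF_NEGATING_VALUES = [v for v in VALUES if neg(v) == v]

print(explosion_holds(K3_DESIGNATED))  # holds vacuously: p & ~p is never TRUE
print(explosion_holds(LP_DESIGNATED))  # fails: p = MIDDLE, q = FALSE
```

On this encoding, Explosion holds vacuously under the truth-gap reading but fails under the truth-glut reading, where a contradiction can be designated while an arbitrary q is not.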

In formalizing reasoning with paradoxical sentences in Priest’s theory, a paradoxical sentence will imply some sentence P & ~P in the object language; but using Tarski’s T-scheme, this transforms immediately into:

P is true and P is not true

so the contradiction propagates into the metalanguage.

A principal virtue of the paraconsistency treatment is that, unlike with Barwise and Etchemendy’s treatment, a sentence can be about the whole world. Critics of this approach to the Liar have complained that it does not seem to solve the Strengthened Liar Paradox, nor Curry’s Paradox; and it does violence to our intuition that sentences cannot be both true and false in the same sense in the same situation. See the last paragraph of “Paradoxes of Self-Reference,” for more discussion of using paraconsistency as a way out of the Liar Paradox.

4. Conclusion

To summarize, when we treat the Liar Paradox we should provide two things, an informal diagnosis which pinpoints the part of the paradox’s argument that has led us astray, and a formalism that prevents the occurrence of the paradox’s argument within that formalism.

Russell, Tarski, Kripke, Barwise-Etchemendy, and Priest (among many others) deserve credit for providing a philosophical justification for their proposed solutions while also providing a formal treatment in symbolic logic that shows in detail both the character and implications of their proposed solutions. The theories of Russell and of Tarski-Quine do provide a treatment of the Strengthened Liar, but at the cost of assigning complex levels to the relevant sentences. On the positive side, the Tarski-Quine treatment does not take Russell’s radical step of ruling out all self-reference. Kripke’s elegant and careful treatment of the Classical Liar stumbles on the Strengthened Liar. Barwise and Etchemendy’s way out avoids these problems, but requires accepting the idea that no sentence can be used to say anything about the whole world, including the semantics of our language. Priest’s way out requires giving up our intuition that no context-free, unambiguous sentence is both true and false.

In conclusion, it appears that more work needs to be done in finding the best way, or the best ways, out of the Liar Paradox that will preserve the most important intuitions we have about semantics while avoiding semantic incoherence. In this vein, one can draw a pessimistic conclusion or an optimistic conclusion. Taking the pessimistic route, Putnam says:

If you want to say something about the liar sentence, in the sense of being able to give final answers to the questions “Is it meaningful or not? And if it is meaningful, is it true or false? Does it express a proposition or not? Does it have a truth value or not? And which one?” then you will always fail. In closing, let me say that even if Tarski was wrong (as I believe he was) in supposing that ordinary language is a theory and hence can be described as “consistent” or “inconsistent,” and even if Kripke and others have shown that it is possible to construct languages that contain their own truth-predicates, still, the fact remains that the totality of our desires with respect to how a truth-predicate should behave in a semantically closed language, in particular, our desire to be able to say without paradox of an arbitrary sentence in such a language that it is true, or that it is false, or that it is neither true nor false, cannot be adequately satisfied. The very act of interpreting a language that contains a liar sentence creates a hierarchy of interpretations, and the reflection that this generates does not terminate in an answer to the questions “Is the liar sentence meaningful or meaningless, or if it is meaningful, is it true or false?” (Putnam 2000)

In (Putnam 2012, p. 206), Putnam concluded that “a solution does not seem to be possible” if by a solution, we mean one that makes all appearance of paradox go away.

More optimistically, should there really be so much fear and loathing about limitations on our ability to formally express all the theses of our favored theory? Many fields have learned to live with their limitations. ZFC set theory cannot speak of the set of all its sets, but it remains a fruitful theory.

See also Logical Paradoxes.

5. References and Further Reading

For further reading on the Liar Paradox that provides more of an introduction to it while not presupposing a strong background in symbolic logic, the author recommends reading the article below by Mates, plus the first chapter of the Barwise-Etchemendy book, and then chapter 9 of the Kirkham book. The rest of this bibliography is a list of contributions to research on the Liar Paradox, and all members of the list require the reader to have significant familiarity with the techniques of symbolic logic. In the formal, symbolic tradition, other important researchers in the last quarter of the 20th century when research on the Liar increased dramatically were Burge, Gupta, Herzberger, McGee, Parsons, Putnam, Routley, Skyrms, van Fraassen, and Yablo.

  • Barwise, Jon and John Etchemendy. The Liar: An Essay in Truth and Circularity, Oxford University Press, 1987.
  • Beall, J.C. “Is Yablo’s Paradox Non-Circular?” Analysis, 61 (2001), 176-187.
  • Burge, Tyler. “Semantical Paradox,” Journal of Philosophy, 76 (1979), 169-198.
  • Corcoran, John. “Sentence, Proposition, Judgment, Statement, and Fact: Speaking about the Written English Used in Logic” in W. A. Carnielli (ed.), The Many Sides of Logic, College Publications. pp. 71-103. 2009.
  • Dowden, Bradley. “Accepting Inconsistencies from the Paradoxes,” Journal of Philosophical Logic, 13 (1984), 125-130.
  • Gupta, Anil. “Truth and Paradox,” Journal of Philosophical Logic, 11 (1982), 1-60. Reprinted in Martin (1984), 175-236.
  • Herzberger, Hans. “Paradoxes of Grounding in Semantics,” Journal of Philosophy, 68 (1970), 145-167.
  • Kirkham, Richard. Theories of Truth: A Critical Introduction, MIT Press, 1992.
  • Kripke, Saul. “Outline of a Theory of Truth,” Journal of Philosophy, 72 (1975), 690-716. Reprinted in (Martin 1984).
  • Martin, Robert. The Paradox of the Liar, Yale University Press, Ridgeview Press, 1970. 2nd ed. 1978.
  • Martin, Robert. Recent Essays on Truth and the Liar Paradox, Oxford University Press, 1984.
  • Martin, Robert and Peter Woodruff. “On Representing ‘True-in-L’ in L,” Philosophia, 5 (1975), 217-221.
  • Mates, Benson. Skeptical Essays, The University of Chicago Press, 1981. See especially “Two Antinomies,” on pages 15-57.
  • McGee, Vann. Truth, Vagueness, and Paradox: An Essay on the Logic of Truth, Hackett Publishing, 1991.
  • Parsons, Charles. “The Liar Paradox,” Journal of Philosophical Logic, 3 (1974), 381-412.
  • Priest, Graham. “The Logic of Paradox,” Journal of Philosophical Logic, 8 (1979), 219-241; and “Logic of Paradox Revisited,” Journal of Philosophical Logic, 13 (1984), 153-179.
  • Priest, Graham, Richard Routley, and J. Norman (eds.). Paraconsistent Logic: Essays on the Inconsistent, Philosophia-Verlag, 1989.
  • Prior, Arthur N. “Epimenides the Cretan,” Journal of Symbolic Logic, 23 (1958), 261-266.
  • Prior, Arthur N. “On a Family of Paradoxes,” Notre Dame Journal of Formal Logic, 2 (1961), 16-32.
  • Putnam, Hilary. Realism with a Human Face, Harvard University Press, 1990.
  • Putnam, Hilary. “Paradox Revisited I: Truth.” In Gila Sher and Richard Tieszen, eds., Between Logic and Intuition: Essays in Honor of Charles Parsons, Cambridge University Press, (2000), 3-15.
  • Putnam, Hilary. Philosophy in an Age of Science: Physics, Mathematics, and Skepticism. Harvard University Press, 2012.
  • Quine, W. V. O. “The Ways of Paradox,” in his The Ways of Paradox and Other Essays, rev. ed., Harvard University Press, 1976.
  • Russell, Bertrand. “Mathematical Logic as Based on the Theory of Types,” American Journal of Mathematics, 30 (1908), 222-262.
  • Russell, Bertrand. Logic and Knowledge: Essays 1901-1950, ed. by Robert C. Marsh, George Allen & Unwin Ltd. (1956).
  • Skyrms, Brian. “Return of the Liar: Three-valued Logic and the Concept of Truth,” American Philosophical Quarterly, 7 (1970), 153-161.
  • Slater, Hartley. “Logic is Not Mathematical,” Polish Journal of Philosophy, Spring 2012, pp. 69-86.
  • Strawson, P. F. “Truth,” in Analysis, 9, (1949).
  • Tarski, Alfred. “The Concept of Truth in Formalized Languages,” in Logic, Semantics, Metamathematics, pp. 152-278, Clarendon Press, 1956.
  • Tarski, Alfred. “The Semantic Conception of Truth and the Foundations of Semantics,” in Philosophy and Phenomenological Research, Vol. 4, No. 3 (1944), 341-376.
  • Van Fraassen, Bas. “Truth and Paradoxical Consequences,” in (Martin 1970).
  • Woodruff, Peter. “Paradox, Truth and Logic Part 1: Paradox and Truth,” Journal of Philosophical Logic, 13 (1984), 213-231.
  • Wittgenstein, Ludwig. Remarks on the Foundations of Mathematics, Basil Blackwell, 3rd edition, 1978.
  • Yablo, Stephen. “Paradox without Self-Reference,” Analysis, 53 (1993), 251-252.

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.

The Sheffer Stroke

The Sheffer Stroke is one of the sixteen definable binary connectives of standard propositional logic. The stroke symbol is “|” as in

    \[(p \mid q) \leftrightarrow (\neg p \vee \neg q)\]

The linguistic expression whose logical behavior is presumed modeled by this logical connective is the truth-functional phrase “not both,” from which the name NAND originates.

All sixteen connectives interpret associated functions of the Boolean algebra. In the theory of electronic circuits the Boolean functions are implemented by electronic or logic gates: the gate implementing the associated function of the Sheffer Stroke is called NAND and is known as a “universal gate.” The Sheffer Stroke has the remarkable metalogical property known as functional completeness (more precisely, weak functional completeness.) A connective is functionally complete (more precisely, weakly functionally complete) for a formal language L if and only if all mathematically definable connectives of L (except for the zeroary connectives or constants) can be defined by using that connective as the only connective. In using the familiar truth table for the semantics of the standard propositional logic, the functional completeness of the Sheffer Stroke means that, for every truth table labeled by a well-formed formula of the logic, there is an identical truth table whose labeling formula has the Sheffer Stroke symbol as the only connective symbol; or, every definable connective can be defined by a truth table that is labeled by a formula that has the Sheffer Stroke as its only connective symbol. (Two truth tables are identical if they agree on every truth value output corresponding to the same truth value input assignments.) The same observations about functional completeness apply to the case of the Peirce Arrow, which is the dual of the Sheffer Stroke.
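The idea that other connectives can be defined with the Sheffer Stroke as the only connective can be checked mechanically. The Python sketch below uses the standard textbook definitions of negation, conjunction, disjunction, and material implication in terms of NAND and compares each against Python's built-in Boolean operators on every assignment; the function names (nand, not_, and_, or_, implies, same_truth_table) are illustrative and do not come from the article.

```python
from itertools import product


def nand(p, q):
    """The Sheffer Stroke: false only when both inputs are true."""
    return not (p and q)


# Each connective below uses nand as its only connective.
def not_(p):
    return nand(p, p)


def and_(p, q):
    return nand(nand(p, q), nand(p, q))


def or_(p, q):
    return nand(nand(p, p), nand(q, q))


def implies(p, q):
    return nand(p, nand(q, q))


def same_truth_table(defined, reference, arity):
    """Two connectives have identical truth tables if they agree on every input row."""
    return all(
        defined(*row) == reference(*row)
        for row in product([True, False], repeat=arity)
    )


print(same_truth_table(not_, lambda p: not p, 1))
print(same_truth_table(and_, lambda p, q: p and q, 2))
print(same_truth_table(or_, lambda p, q: p or q, 2))
print(same_truth_table(implies, lambda p, q: (not p) or q, 2))
```

Each comparison corresponds to the observation in the text that, for a given truth table, there is an identical truth table whose labeling formula has the stroke as its only connective symbol.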

The discovery of the Sheffer Stroke was achieved independently by Henry M. Sheffer in 1913 after it had been realized previously by Charles Sanders Peirce, as attested by a fragment written in 1880 (and, again, in 1902). This landmark discovery was hailed by such seminal figures in the history of logic as Ludwig Wittgenstein and Bertrand Russell.

An elegant result due to Emil Post (1941) makes it possible to account for the property of functional completeness of the Sheffer Stroke on the grounds that it is lacking certain characteristic “hereditary” properties. This is examined in detail in the present article.

The logical-philosophic significance of the availability of a Sheffer function was taken by Ludwig Wittgenstein (in the Tractatus Logico-Philosophicus, 1922) to consist in its perspicuous illustration of deeper features of formal logic. In natural languages, the phrases whose logical behavior is captured by the Sheffer Stroke and the Peirce Arrow are, respectively, “not both” and “neither-nor”: these seem rather unremarkable, but this is a sign that what is at stake in functional completeness investigations is characteristically related to the study of formal logic and is not relevant to the goals of studying natural languages.

 

Table of Contents

  1. The Sheffer Stroke and Its Place in Propositional Logic
    1. Alternative Definitions of the Sheffer Stroke
    2. Decision-Procedural Rules for the Sheffer Stroke
    3. Alternative Symbols
  2. History
    1. Peirce’s Discovery
    2. The “Discovery” and Principia Mathematica
  3. The Logical Connectives of Standard Propositional Logic and the Sheffer Stroke
  4. Properties of the Sheffer Stroke
  5. Significance of the Sheffer Stroke for Mathematical Logic, Philosophical Logic, and Philosophy
    1. Wittgenstein’s Tractatus and the Sheffer Stroke
  6. References and Further Reading

1. The Sheffer Stroke and Its Place in Propositional Logic

The linguistic phrase whose logical behavior is traced through the Sheffer Stroke is the truth-functional expression “not both ___ and ___,” and its logical equivalent is “either not ___ or not ___.” The term “Sheffer Stroke” is the name of the symbol “|” denoting the binary logical connective of standard propositional logic that is usually called the Sheffer Stroke. Thus, the name Sheffer Stroke is used not simply for the symbol but also for the logical connective itself. This article refers to the logical connective indifferently as the Sheffer Stroke, trusting that context removes any ambiguity between the connective and its symbol.

Other names of the logical connective are Alternate Denial, NAND, and Negated Conjunction. Readers of older logic textbooks are likely to find the connective called Alternate Denial. There is another, related, logical connective of standard propositional logic, called NOR, Joint Denial, Negated Disjunction, and Joint Exclusion; it is sometimes called Peirce’s Arrow or Quine’s Dagger (although the last two, as with “Sheffer Stroke,” are, strictly speaking, names of symbols used for that connective). The relationship between the Sheffer Stroke and NOR is deep and has profound interest, which will be explored in this article.

The term “Sheffer Stroke” refers also to the symbol used to denote the logical connective that has the same name; this connective is also known by other names, as will be seen. This article speaks of logical connectives. Strictly speaking, the Sheffer Stroke connective is the semantic analogue of a definable binary Boolean function which can be called the associated Boolean function of the connective. This Boolean function is known as NAND in the theory of electronic or logic gates, where it serves as one of two universal gates; precisely speaking, the physical gate is an instance of implementation of the Boolean function whose propositional-logic interpretation is the Sheffer Stroke. That interpretation is not examined in the present article.

This article focuses only upon propositional logic, unless otherwise indicated. Upon turning to the examination of Wittgenstein’s comments, this restriction will be lifted. Propositional logic can be considered as the special case of predicate or first-order logic with all predicate constants as being zero-place in its signature. Propositional logic is bereft of symbolic resources needed for checking many argument forms as valid, for the translation of mathematical statements, and for many other reasons. This article is confined to propositional logic only for the limited purpose of avoiding certain complications while our present interest is in laying out certain basic concepts.

The term Sheffer Functions is sometimes used to refer to two Boolean functions, one of which is interpreted as our Sheffer Stroke and the other is known as NOR (in some interpretations) or Peirce’s Arrow (and also by other names, as will be seen). Both of these so-called Sheffer functions are binary truth functions (with the names also used for the uninterpreted associated Boolean functions); they both have the remarkable property of being functionally complete, in the sense defined above. The term “Sheffer functions” is also used to refer to functionally complete functions of alternate or non-standard many-valued logics. Some authors who generalize the term “Sheffer function” to many-valued logics define it so that it applies only to unary or binary functions that are, each, functionally complete. Others use the term regardless of the arity of the functionally complete function. A theorem proven by Emil Post in 1921 shows that the existence of functions of arity n = 1 or n = 2 that define all unary and binary functions of a formal language implies that those functions can also define all functions of higher arities. This result holds regardless of the number of truth values over which the connectives are defined. In the standard two-valued propositional logic, there are no unary connectives that are functionally complete but there are exactly two binary connectives that are, and these are called the Sheffer functions of the standard propositional logic.
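The claim that exactly two of the sixteen binary connectives are functionally complete can be checked by brute force. The Python sketch below represents each two-variable truth function as a 4-tuple of outputs, starts from the two projections p and q, and closes them under a single candidate connective; a candidate is counted as a Sheffer function when the closure contains all sixteen two-variable truth functions. It relies on the result attributed to Post in the text, that defining all unary and binary functions suffices for higher arities. The names INPUTS, closure, and SHEFFER are illustrative.

```python
from itertools import product

# Input rows in the order (1,1), (1,0), (0,1), (0,0).
INPUTS = list(product([1, 0], repeat=2))


def closure(op):
    """All two-variable truth functions definable from p and q using op alone.

    A truth function is stored as a 4-tuple of outputs, one per row of INPUTS.
    """
    p = tuple(x for x, _ in INPUTS)
    q = tuple(y for _, y in INPUTS)
    seen = {p, q}
    changed = True
    while changed:
        changed = False
        for a, b in product(list(seen), repeat=2):
            combined = tuple(op[(a[i], b[i])] for i in range(4))
            if combined not in seen:
                seen.add(combined)
                changed = True
    return seen


SHEFFER = []
for outputs in product([1, 0], repeat=4):   # the sixteen binary connectives
    op = dict(zip(INPUTS, outputs))
    if len(closure(op)) == 16:              # every two-variable truth table is reachable
        SHEFFER.append(outputs)

print(SHEFFER)
```

The two output columns that survive, (0, 1, 1, 1) and (0, 0, 0, 1), are the truth tables of NAND and NOR respectively, read down the rows (1,1), (1,0), (0,1), (0,0).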

The kind of inquiry that reveals the remarkable properties of the Sheffer Stroke is, properly speaking, metalogical or metatheoretical. The two logical connectives, NAND (or the Sheffer Stroke) and NOR, are sometimes referred to summarily as the Sheffer functions. Strictly speaking, those are the associated Boolean functions which are semantically interpreted by the logical connectives. For present purposes, this article does not dwell on this distinction. It speaks consistently about logical connectives. The article investigates the significance of the Sheffer Stroke and NOR in the section on the Properties of the Sheffer Stroke. It also traces the historical background of the discovery of these connectives (see History).

Note that there is inconsistency in the bibliography with respect to both notational variants and terminological jargon. The logician H. M. Sheffer, after whom the connective is named, actually used NOR but Russell-Whitehead used the NAND function when they extolled this discovery in a specially added section to the second edition of their famed Principia Mathematica in the aftermath of what they took to be Sheffer’s discovery. (Whitehead-Russell, 1925, 1927) It was Whitehead-Russell who gave the name “Sheffer Stroke” to the connective. Although the symbol itself had been used by Sheffer to denote NOR, Sheffer rather incongruously called the symbol of his connective “per”, in analogy with the symbol of algebraic division, and he called the connective (now usually called NOR) “rejection.”

As a logical connective, the Sheffer Stroke stands for a Boolean function defined over the set of two values,

    \[2 = \{1, 0\}\]

Insofar as the semantic connective Sheffer Stroke is being examined, think of the values as the truth values True and False and denote them respectively by T and F. It is not unusual to speak interchangeably, or indifferently, of truth functions and logical connectives. Unfortunately, as was just noted, the bibliography, ranging over several decades in the development of modern logic, is not consistent when it comes to terminological or notational matters. For present purposes, lay down a certain convention: distinguish between logical connectives (also called truth functions) and their underlying or associated Boolean functions. If the symbol of the logical connective is, generally, “*” then symbolize the associated Boolean function by “f_{*}.” In doing so, reserve standard algebraic methods of definition for the associated functions but define the logical connectives by means of the familiar truth table.

The domain D of the associated function of the Sheffer Stroke f_{\mid} is the Cartesian product

    \[\{1, 0\} \times \{1, 0\}\]

the range R of the function is

    \[\{1, 0\}\]

Thus, the associated function of the Sheffer Stroke connective is defined as follows:

    \[f_{\mid}: D = \{1, 0\} \times \{1, 0\} = \{<1, 1>, <1, 0>, <0, 1>, <0, 0> \} \rightarrow R = \{1, 0\}\]

For the sake of completeness, alternative ways of defining this Boolean function will be shown. These, however, should be considered as notational variants; it is the same Boolean function that they all define.

    \[f_{\mid}(1,1) = 0; \quad f_{\mid}(1,0) = 1; \quad f_{\mid}(0,1) = 1; \quad f_{\mid}(0,0) = 1\]

    \[f_{\mid}(x,y) = \begin{cases} 0 & \text{when } x = y = 1 \\ 1 & \text{otherwise} \end{cases}\]

    \[f_{\mid} = \{<<1, 1>, 0>, <<1, 0>, 1>, <<0, 1>, 1>, <<0, 0>, 1>\}\]
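That these are notational variants of one and the same Boolean function can be confirmed by writing each of them down and comparing. In the Python sketch below, the names BY_POINTS, by_cases, and AS_GRAPH are illustrative labels for the pointwise listing, the definition by cases, and the set of ordered pairs.

```python
from itertools import product

# Domain D = {1, 0} x {1, 0}.
DOMAIN = list(product([1, 0], repeat=2))

# Variant 1: list the value at each of the four points.
BY_POINTS = {(1, 1): 0, (1, 0): 1, (0, 1): 1, (0, 0): 1}


# Variant 2: definition by cases.
def by_cases(x, y):
    return 0 if x == y == 1 else 1


# Variant 3: the function as a set of ordered pairs <<x, y>, value>.
AS_GRAPH = {((1, 1), 0), ((1, 0), 1), ((0, 1), 1), ((0, 0), 1)}

# The three variants agree on every point of the domain.
ALL_AGREE = (
    all(BY_POINTS[(x, y)] == by_cases(x, y) for x, y in DOMAIN)
    and AS_GRAPH == {((x, y), by_cases(x, y)) for x, y in DOMAIN}
)
print(ALL_AGREE)
```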

It is customary to define logical connectives of logical systems or languages by means of the familiar truth table. The truth table for the connective called the Sheffer Stroke or NAND is given below.

 p  q  p  |  q
 T  T  T  F  T
 T  F  T  T  F
 F  T  F  T  T
 F  F  F  T  F

One can also use the familiar truth table to ascertain that the Sheffer Stroke connective yields the same truth value outputs as the negation of conjunction for all possible assignments of truth values to the individual propositional components. The propositional connective negation, by definition, reverses the truth value of its input, and the conjunction connective receives the output T only when both of its inputs are T, while it receives F for all other possible assignments of truth values to its components. This article symbolizes the negation connective by “\neg” and the conjunction connective by “\wedge”. Because the formulas \ulcorner p \mid q \urcorner and \ulcorner \neg (p \wedge q) \urcorner are logically equivalent, the formula formed by connecting them with “\leftrightarrow” (symbol of material equivalence) should be a tautology: the truth table verifies this result. (The logical connective of material equivalence is so defined that it receives the output T if and only if its input values are the same truth values.)

 p  q  (p  \mid  q)  \leftrightarrow  \neg  (p  \wedge  q)
 T  T   T   F    T          T          F     T    T     T
 T  F   T   T    F          T          T     T    F     F
 F  T   F   T    T          T          T     F    F     T
 F  F   F   T    F          T          T     F    F     F
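The same check can be run mechanically. The Python sketch below takes the stroke's truth table as given, computes the negation of the conjunction row by row, and collects the column under the material equivalence; the names NAND_TABLE, neg, conj, iff, and EQUIVALENCE_COLUMN are illustrative.

```python
from itertools import product

T, F = True, False

# The truth table of the Sheffer Stroke, taken as given.
NAND_TABLE = {(T, T): F, (T, F): T, (F, T): T, (F, F): T}


def neg(p):
    """Negation reverses the truth value of its input."""
    return not p


def conj(p, q):
    """Conjunction is true only when both inputs are true."""
    return p and q


def iff(a, b):
    """Material equivalence is true exactly when both sides have the same value."""
    return a == b


# Column under the main connective of (p | q) <-> not (p and q), row by row.
EQUIVALENCE_COLUMN = [
    iff(NAND_TABLE[(p, q)], neg(conj(p, q)))
    for p, q in product([T, F], repeat=2)
]

IS_TAUTOLOGY = all(EQUIVALENCE_COLUMN)
print(EQUIVALENCE_COLUMN, IS_TAUTOLOGY)
```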

 

The linguistic expression whose logical behavior is presumed modeled by this logical connective is the truth-functional phrase “not both,” from which the name NAND originates. This expression is logically equivalent (it yields the same truth value for the same assignments of truth values to its components) with “either not the first or not the second” for two component propositions; hence the alternative name of this connective as Alternate Denial. Making the claim that the Sheffer Stroke connective models such expressions of language means that what is modeled is taken to be truth-functional expressions of a natural language like English. Truth-functionality means that the compound proposition always takes a truth value (true or false) that can be uniquely determined when the truth values of the components or parts are known; this is because the special logical particle (in this case “not both”) that connects the component propositions is definable in terms of its truth conditions (what truth value it yields for specified assignments of truth values to the component propositions it connects). Insofar as one is dealing with truth-functional expressions, the Principle of Compositionality of Meaning applies: the logical meaning of the composite depends uniquely on the specified logical meanings of its parts. For non-truth-functional meanings of “not” or “and,” the expression “not both” is not truth-functional and cannot be modeled by the connective called Sheffer Stroke or NAND.

Connecting two propositions by means of the linguistic particle modeled by the Sheffer Stroke asserts the claim that these two propositions are mutual contraries. One can appreciate what contrariety means by checking the truth table above, by means of which the Sheffer Stroke connective is defined: the compound in which the Sheffer Stroke symbol is the principal-connective symbol is false only when the connected propositional components are both true; it is true in every other case (or model, that is, an assignment of truth values to the propositional components, also called a valuation). Contrariety (or mutual contrariety), then, means that the propositions that are presumed contraries cannot possibly be true together but they can possibly be false together. One should distinguish this from the relationship known as mutual contradictoriness: two propositions are mutual contradictories if and only if they cannot possibly be true together and they cannot possibly be false together. If two propositions p and q are mutual contradictories, then the compound proposition formed by connecting them by means of the exclusive either-or is a logical truth. On the other hand, based on what has been said, and as can be seen by the truth table above, one has: when two propositions are mutual contraries, then the proposition formed by connecting them by means of the Sheffer Stroke connective is a logical truth.

a. Alternative Definitions of the Sheffer Stroke

There are other ways of defining the Sheffer Stroke connective. Its matrix definition is as follows:

    \[\begin{array}{c|cc} \pmb{p \mid q} & \pmb{T} & \pmb{F} \\ \hline \pmb{T} & F & T \\ \pmb{F} & T & T \end{array}\]

The Disjunctive Normal Form (DNF) of

    \[\ulcorner p \mid q\urcorner\]

is

    \[\ulcorner \neg p \vee \neg q\urcorner\]

(Corner brackets are used because there is reference to symbols of the formal object language within the metalanguage, which is a symbolically enhanced fragment of English used to talk about the formal language. Notice that symbols like “\varphi”, on the other hand, are themselves metalinguistic and do not take corner brackets. Nor are such brackets needed when formulas are displayed by themselves in space reserved for them.)

The DNF of a well-formed formula \varphi can be obtained from the truth table of \varphi by means of the following method: Check the rows, and only the rows, across which \varphi receives the truth value T. If an individual (or atomic) variable receives T on that row, reproduce it as it is, \ulcorner p \urcorner; if the individual variable receives F on that row, reproduce it as negated, \ulcorner \neg p \urcorner. Next, form the conjunction of the propositional variables so represented (which means that one connects them by the connective symbol \ulcorner \wedge \urcorner). Do this for all rows on which \varphi receives T. Finally, join all the conjunctions formed in this manner by means of inclusive disjunction, symbolized by \ulcorner \vee \urcorner.

Thus, examining the truth table by means of which the Sheffer Stroke was defined, one has: the value T is received on the rows for values of the single propositional variables:

    \[<p^{T}, q^{F}>, <p^{F}, q^{T}>, <p^{F}, q^{F}>\]

 

Form the conjunctions first:

    \[p \wedge \neg q, \neg p \wedge q, \neg p \wedge \neg q\]

 

Then, form their disjunction:

    \[(p \wedge \neg q) \vee (\neg p \wedge q) \vee (\neg p \wedge \neg q)\]

 

This expression admits of further simplification (a subject that is beyond current concerns), to yield a logically equivalent formula:

    \[\neg p \vee \neg q\]
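The row-by-row method described above can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the original presentation: the function names nand and dnf_from_truth_table are hypothetical, and the Python words not, and, or stand in for the symbols of negation, conjunction and inclusive disjunction.

```python
from itertools import product

def nand(p, q):
    """Sheffer Stroke: false only when both inputs are true."""
    return not (p and q)

def dnf_from_truth_table(func, names):
    """Build a DNF string from the rows on which func receives True.

    Hypothetical helper: rows are visited in the order TT, TF, FT, FF.
    A function that is never true yields the empty string.
    """
    disjuncts = []
    for values in product([True, False], repeat=len(names)):
        if func(*values):
            # Keep a variable as it is if it is True on this row, negate it otherwise.
            literals = [n if v else "not " + n for n, v in zip(names, values)]
            disjuncts.append("(" + " and ".join(literals) + ")")
    return " or ".join(disjuncts)

print(dnf_from_truth_table(nand, ["p", "q"]))
# prints: (p and not q) or (not p and q) or (not p and not q)

# The simplified form agrees with the Sheffer Stroke on every row.
simplified_ok = all(
    nand(p, q) == ((not p) or (not q))
    for p, q in product([True, False], repeat=2)
)
print(simplified_ok)  # True
```

The sketch does not perform the simplification step itself; it only confirms, by exhausting the four rows, that the simplified formula and the full DNF define the same truth function.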

 

The Sheffer Stroke can also be represented by the method known as the Karnaugh Map, two variants of which are explored here. The method is essentially diagrammatic: it allows for simplifications of well-formed formulas, which are first transformed into their equivalent normal forms before they are mapped by this type of diagram. The normal form for the Sheffer Stroke is:

    \[(p \mid q) \leftrightarrow (\neg p \vee \neg q)\]

 

The expression to the right is in both Disjunctive and Conjunctive Normal Form. It has exactly two literals, \ulcorner \neg p\urcorner and \ulcorner \neg q\urcorner. Taken as a Disjunctive Normal Form, it has as literals the negations of the two propositional variables; accordingly, we enter the values T and F into the Karnaugh Map in the way briefly presented now. (Usually, this kind of diagram takes the values as uninterpreted or numerical, \{1, 0\}, but we can disregard this.) To enter the proper values, we follow the entire row or entire column along which a literal of the normal form receives the truth value True, as shown below. The remaining blocks receive F.

    \[\begin{array}{c|cc} \pmb{p \mid q} & \pmb{q} & \pmb{\neg q} \\ \hline \pmb{p} & F & T \\ \pmb{\neg p} & T & T \end{array}\]

An alternative version (actually corresponding more closely to the initial design of this diagrammatic method) is as follows:

    \[\begin{array}{c|cc} \pmb{p \mid q} & \pmb{T} & \pmb{F} \\ \hline \pmb{T} & F & T \\ \pmb{F} & T & T \end{array}\]

In older texts, we find definitions of connectives like the following definition of the Sheffer Stroke. We consider the propositional variables to be taking truth values in the order:

    \[<TT, TF, FT, FF>\]

This method of definition is found, along with the truth-tabular definition, in Wittgenstein’s Tractatus.

    \[p \mid q \stackrel{\text{def}}{=} (FTTT) (p,q)\]

In textbooks like the one written by Arthur Prior (1962, pp. 5-21) the definition would be given as follows:

    \[T \mid T = F ; T \mid F = T ; F \mid T = T ; F \mid F = T\]

Because Prior uses the Polish notation (see section 1c below), he defines the Sheffer Stroke and Peirce Arrow, symbolized respectively by “D” and “X”, as follows—with “N” symbolizing negation, “A” symbolizing inclusive disjunction, “K” symbolizing conjunction, while prefix notation is used throughout:

    \[Dpq \stackrel{\text{def}}{=} NKpq = ANpNq\]

    \[Xpq \stackrel{\text{def}}{=} NApq = KNpNq\]

Another way of defining the Sheffer Stroke and Peirce Arrow is given (Prior, 1962, p. 12), reading the output values from left to right inside the parenthesis as corresponding to value assignments for the atomic components as

    \[<1, 1>, <1, 0>, <0, 1>, <0, 0>\]

    \[Dpq: (0, 1, 1, 1)pq\]

    \[Xpq: (0, 0, 0, 1)pq\]

In the set-theoretic interpretation of Boolean functions, the operation that corresponds to the Sheffer Stroke or NAND is complementation of intersection of sets. Clearly, complementation (symbolized by “'”) is the set-theoretic analogue of negation and intersection (symbolized by “\cap”) is the set-theoretic analogue of conjunction. The symbol “\in” stands for set membership.

(A \cap B)' = \{x: it is not the case that both x \in A and x \in B\}

A Venn diagram can be drawn of the operation.

General Venn Diagram Regions
A: 1 and 2
B: 2 and 3
A \cap B: 2
A': 3 and 4
B': 1 and 4
A' \cap B': 4
NAND: (A \cap B)': 1 and 3 and 4
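
The region bookkeeping above can be reproduced with Python sets. This is a small sketch under the assumption that the universe consists of exactly the four numbered regions of the general Venn diagram; the names universe, complement and nand_region are chosen here for illustration.

```python
# Assumption for illustration: the universe is the set of the four
# numbered regions of the general Venn diagram listed above.
universe = {1, 2, 3, 4}
A = {1, 2}
B = {2, 3}

def complement(s):
    """Set-theoretic analogue of negation, relative to the universe."""
    return universe - s

# NAND corresponds to the complement of the intersection.
nand_region = complement(A & B)
# By De Morgan's law it is also the union of the two complements.
de_morgan_region = complement(A) | complement(B)

print(sorted(nand_region))              # [1, 3, 4]
print(nand_region == de_morgan_region)  # True
```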

 

[Figure: NAND Venn Diagram (yellow area)]

Boolean functions can be represented as operations in an algebra,

    \[\mathscr{B} = <\{1, 0\}, \{\times, +\}, 1>\]

with carrier set \{1, 0\} and adequately equipped with a set of operations of multiplication and addition-modulo-2 along with the constant or zero-ary function 1. The definitions of the operations over the carrier set’s values are:

    \[1 \times 1 = 1, 1 \times 0 = 0, 0 \times 1 = 0, 0 \times 0 = 0\]

    \[1 + 1 = 0 + 0 = 0, 1 + 0 = 0 + 1 = 1\]

The Sheffer Stroke and Peirce Arrow are definable in this algebra as:

    \[f_{\mid}(x, y) = (x \times y) + 1\]

    \[f_{\downarrow}(x, y) = (x \times y) + x + y + 1\]

The multiplication sign is omitted as is conventional in standard notations. So, we have:

    \[f_{\mid}(x, y) = xy + 1\]

    \[f_{\downarrow}(x, y) = xy + x + y + 1\]

Considering that the general form for binary polynomials representing functions is

    \[f^{*}(x, y) = \alpha xy + \beta x + \gamma y + \delta\]

the coefficients for the Sheffer Stroke are

    \[\alpha = 1, \beta = 0, \gamma = 0, \delta = 1\]

The general form can also be represented as follows and, by having recourse to the familiar semantic truth table, we can determine the values of the coefficients which are, in this representation form, the values of the function for the shown pairs (i.e., f_{\mid}(1, 1) = 0, f_{\mid}(1, 0) = 1, f_{\mid}(0, 1) = 1, f_{\mid}(0, 0) = 1).

    \[f_{\mid}(x, y) = f_{\mid}(1, 1)xy + f_{\mid}(1, 0)x(1 + y) + f_{\mid}(0, 1)(1 + x)y + f_{\mid}(0, 0)(1 + x)(1 + y)\]

By carrying out the operations in the algebra, we obtain the expected result, keeping in mind that (2 = 0) (modulo 2):

    \[\begin{aligned} & f_{\mid}(1, 1)xy + f_{\mid}(1, 0)x(1 + y) + f_{\mid}(0, 1)(1 + x)y + f_{\mid}(0, 0)(1 + x)(1 + y) \\ &= 0xy + 1x(1 + y) + 1(1 + x)y + 1(1 + x)(1 + y) \\ &= x + xy + y + xy + 1 + x + y + xy \\ &= 2x + 2y + 2xy + xy + 1 \\ &= xy + 1 \end{aligned}\]
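
The arithmetic modulo 2 used in this derivation can be checked mechanically. The following Python sketch is an illustration only: the function names f_nand, f_nor, expansion and coefficients are hypothetical, and the coefficient formulas in coefficients are simply what one obtains by substituting 0 and 1 into the general form of the binary polynomial.

```python
from itertools import product

def f_nand(x, y):
    """Sheffer Stroke as a polynomial over {1, 0}: xy + 1 (mod 2)."""
    return (x * y + 1) % 2

def f_nor(x, y):
    """Peirce Arrow as a polynomial over {1, 0}: xy + x + y + 1 (mod 2)."""
    return (x * y + x + y + 1) % 2

def expansion(f, x, y):
    """The four-term representation built from the values of f on the four pairs."""
    return (f(1, 1) * x * y
            + f(1, 0) * x * (1 + y)
            + f(0, 1) * (1 + x) * y
            + f(0, 0) * (1 + x) * (1 + y)) % 2

def coefficients(f):
    """Return (alpha, beta, gamma, delta) of alpha*xy + beta*x + gamma*y + delta."""
    delta = f(0, 0)
    beta = (f(1, 0) + delta) % 2
    gamma = (f(0, 1) + delta) % 2
    alpha = (f(1, 1) + beta + gamma + delta) % 2
    return alpha, beta, gamma, delta

pairs = list(product([1, 0], repeat=2))
print(all(expansion(f_nand, x, y) == f_nand(x, y) for x, y in pairs))  # True
print(coefficients(f_nand))  # (1, 0, 0, 1)
print(coefficients(f_nor))   # (1, 1, 1, 1)
```

Reading 1 as True and 0 as False, f_nand returns 0 only for the pair (1, 1), which matches the truth-tabular definition of the Sheffer Stroke.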

b. Decision-Procedural Rules for the Sheffer Stroke

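The decision procedure in propositional logic known as the Tree Method can incorporate rules for “\mid” as follows: a formula of the form “\varphi \mid \psi” splits the branch in two, with the negation of \varphi on one branch and the negation of \psi on the other, while a formula of the form “\neg (\varphi \mid \psi)” extends the branch with both \varphi and \psi.

In the Beth-Tableau Method, the rules for “\mid” should be represented as follows: a formula “\varphi \mid \psi” in the True column splits the tableau into two subtableaux, with \varphi entered in the False column of one and \psi entered in the False column of the other, while a formula “\varphi \mid \psi” in the False column places both \varphi and \psi in the True column.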

It is possible to develop a Gentzen-sequent rule for the Sheffer Stroke. (See Riser, 1967; Béziau, 2001; for a more detailed analysis, Read, 1999.) The theoretical significance of proof-theoretic procedures like Gentzen’s is that the connectives are then defined by means of the rules for their introduction and/or elimination; there is a substantive philosophical view that this is the proper approach to assessing the meanings of logical connectives. In Gentzen-style sequents, the variables to the left of the turnstile symbol (“\vdash”) are presumed joined by conjunction and those to the right are presumed joined by inclusive disjunction. A variable may be shifted from left to right or from right to left by being negated. Repeated variable letters may be deleted (by means of a rule known as Contraction) and variable letters may be shifted freely (or permuted) insofar as they stay on the same side of the turnstile.

c. Alternative Symbols

Another symbol for the Sheffer Stroke or NAND connective is “\uparrow” and this symbol is, appropriately, called the “Sheffer Dagger” or “Sheffer Upward Arrow.” An older symbol is “\veebar” (for instance, in Alonzo Church’s influential text on Mathematical Logic, Introduction to Mathematical Logic, p. 37), but this symbol is now more commonly used in certain notational variants to symbolize exclusive disjunction. (See History below for symbols used by Sheffer himself and by C. S. Peirce.)

In Polish notation, which uses not infix but prefix placement for connective symbols and neatly dispenses with parentheses, the symbolization for NAND is:

    \[Dpq\]

To write in Polish notation that material equivalence (symbolized by “E”) obtains between NAND and the negation (symbolized by “N”) of conjunction (symbolized by “K”), we write:

    \[EDpqNKpq\]
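
Because prefix notation needs no parentheses, a formula in Polish notation can be evaluated by a very short recursive reader. The sketch below is an illustration under stated assumptions: the names BINARY, evaluate and is_tautology are hypothetical, variables are single lowercase letters, and only the connective letters mentioned in this article (N, K, A, E, D, X) are supported.

```python
from itertools import product

# Binary connective letters in Polish notation, as used in this article.
BINARY = {
    "K": lambda a, b: a and b,        # conjunction
    "A": lambda a, b: a or b,         # inclusive disjunction
    "E": lambda a, b: a == b,         # material equivalence
    "D": lambda a, b: not (a and b),  # Sheffer Stroke (NAND)
    "X": lambda a, b: not (a or b),   # Peirce Arrow (NOR)
}

def evaluate(formula, valuation):
    """Evaluate a prefix formula under a valuation (dict from letters to bools)."""
    def parse(i):
        symbol = formula[i]
        if symbol == "N":                       # unary negation
            value, nxt = parse(i + 1)
            return (not value), nxt
        if symbol in BINARY:                    # binary connective: read two arguments
            left, nxt = parse(i + 1)
            right, nxt = parse(nxt)
            return BINARY[symbol](left, right), nxt
        return valuation[symbol], i + 1         # propositional variable

    value, end = parse(0)
    if end != len(formula):
        raise ValueError("symbols left over after a complete formula")
    return value

def is_tautology(formula, variables="pq"):
    """Check the formula on every assignment of truth values to the variables."""
    return all(
        evaluate(formula, dict(zip(variables, values)))
        for values in product([True, False], repeat=len(variables))
    )

print(is_tautology("EDpqNKpq"))   # True
print(is_tautology("EDpqANpNq"))  # True
print(is_tautology("EDpqXpq"))    # False
```

The first two checks correspond to the two defining equivalences for D quoted from Prior; the third confirms that NAND and NOR are not equivalent to each other.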

The symbolic variant used for logical gates in electronic circuitry also deploys prefix notation (with the symbol of the function written before and not in between the input variables). Thus,

    \[NAND (A, B)\]

As the case usually is with writing out functions, it should be noted that there is ambiguity surrounding the notation used for representing the Boolean function interpreting the Sheffer Stroke: it is not clear if it is the operation that is represented or if a name of the function is given. The notation of the so-called lambda-calculus (or \lambda-calculus) can be used to disambiguate. Accordingly, to indicate unambiguously that we are giving the name of the underlying function of the NAND (or Sheffer Stroke) connective, we can write:

\lambda x.\lambda y (f_{\mid}(x, y)) (—) (___)

with possible specification of the underlined input variables from the set \{1, 0\}.
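
The distinction between naming a function and applying it to inputs has a close analogue in programming languages with anonymous functions. The following Python fragment is only a loose illustration: the name f_nand is hypothetical, and the curried form (one input at a time) is chosen to mirror the nested lambda abstraction.

```python
# Hypothetical name f_nand; inputs are taken from the set {1, 0}.
# The nested lambdas mirror "lambda x. lambda y. f(x, y)".
f_nand = lambda x: lambda y: (x * y + 1) % 2

# f_nand by itself denotes the function; supplying inputs yields a value.
awaiting_second_input = f_nand(1)   # still a function of y
print(f_nand(1)(1))                 # 0
print(awaiting_second_input(0))     # 1
```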

2. History

The logical connective we call the Sheffer Stroke and its symbol are named after Henry Maurice Sheffer who, in 1913, published a paper in which he introduced a connective (called a “primitive idea” in the jargon of the times) with remarkable logical properties. Sheffer’s project was motivated by the purpose of using this connective to provide a more parsimonious or economical rendering of Huntington’s axiom system for standard propositional logic. In the parlance of the times, the purpose was to “reduce” the number of “primitive” connectives of standard propositional logic. We will see in a subsequent section what all this amounts to.

It so happens that Sheffer used another logical connective which, like the Sheffer Stroke, allows for a reduction of the number of logical connectives that are used. This connective is usually called NOR, Peirce’s Arrow or Joint Denial. The name Sheffer’s Stroke was coined by the authors of Principia Mathematica (Whitehead-Russell, 1963) who extolled the significance of the discovery of this connective and proceeded to add an entire section to the 2^{nd} edition of the Principia utilizing the connective. We will be able to fully appreciate the claims made about the significance of this discovery after we have studied the section on the Properties of the Sheffer Stroke. An entire section, Significance of the Sheffer Stroke for Mathematical Logic, Philosophical Logic, and Philosophy, will be devoted to assessing the importance of this connective.

Sheffer himself had called his connective “rejection,” inspired by the correspondence of this connective to the linguistic expression “neither-nor.” Another name that was once in usage for this connective is “dispersion.” As we have mentioned, this connective is usually called NOR or Peirce’s Arrow today. Sheffer called the propositional variables that are the connective’s related variables or inputs “rejects.” Rather inopportunely, he gave to the connective symbol the name “per” in analogy to the name of the symbol of the standard algebraic division: in terms of the underlying algebra of modern propositional logic, however, there is no satisfactory Boolean analogue to algebraic division and, so, the name “per” is misleading.

a. Peirce’s Discovery

It turns out that the American logician and philosopher Charles Sanders Peirce (1839-1914) had already discovered the logical connective we call the Sheffer Stroke, as well as the related connective NOR (also called Joint Denial, and quite appropriately Peirce’s Arrow, with other names in use being Quine’s Arrow or Quine’s Dagger and today usually symbolized by “\downarrow”). The relevant manuscript, dating to 1880, numbered MS 378 in a subsequent edition and titled “A Boolian [sic] Algebra with One Constant” (Peirce, 1971), was actually destined for discarding and was salvaged for posterity in the nick of time in 1926. A fragmentary text by Peirce dating from 1880 also shows familiarity with the remarkable metalogical characteristics that make a single function functionally complete, and this is also the case with Peirce’s unfinished Minute Logic (1902, ch. 3): these texts were eventually published posthumously (1933, vol. 4, pp. 13-18, 215-216.)

Peirce designated the two truth functions, NAND and NOR, by using the symbol “\curlywedge” which he called Ampheck, coining this neologism from the Greek word ἀμφήκης which means “of equal length in both directions.” (Peirce, 1933: 4.264) Peirce’s editors disambiguated the use of symbols by assigning “\overline{\curlywedge}” to the connective we call the Sheffer Stroke while preserving the symbol “\curlywedge” for NOR.

(More about Peirce’s work in logic, including reference to the 1880 manuscript, can be found in another encyclopedia article.)

As Sheffer did later, Peirce understood that these two connectives can be used to “reduce” all mathematically definable connectives (also called “primitives” and “constants”) of propositional logic: this means that all definable connectives of propositional logic can be defined by using only the Sheffer Stroke or only NOR as the single connective. No other connective (or associated function) that takes one or two variables as inputs has this property; in particular, standard, two-valued propositional logic has no unary functions that have the property of functional completeness. In a subsequent section, we will explore this remarkable logical property in detail. At first blush, availability of this option ensures that economy of resources can be obtained, at least in terms of how many functions or connectives are to be included as undefined. Unfortunately, there is a trade-off between this gain in economy of symbolic resources and the unwieldy length and rather counterintuitive appearance of the formulas that use only the one connective.

It is characteristic of Peirce’s logical genius and emblematic of his rather under-appreciated contributions to the development of modern logic that he grasped the significance of functional completeness and figured out what truth functions (up to arity 2) are functionally complete for two-valued propositional logic. (Strictly speaking, this is the property of weak functional completeness, given that we disregard whether constants or zero-ary functions like 1 or 0 can be defined.) Peirce subscribed to a Semeiotic view, according to which the fundamental nature and proper tasks of the formal study of logic are defined by the rules set down for the construction and manipulation of symbolic resources. A proliferation of symbols for the various connectives that are admitted into the signature of a logical system suffers from a serious defect on this view: the symbolic grammar fails to match or represent the logical fact of interdefinability of the connectives.

Peirce was sometimes willing to accept constructing a formal signature for two-valued propositional logic by using the two-member set of connectives \{\rightarrow , \bot \}, which is minimally functionally complete. This means that these two connectives (or, if we are to stick to an approach that emphasizes the notational character of logical analysis, these two symbols) are expressively adequate: every mathematically definable connective of the logic can be defined by using only these two; and the set is minimally functionally complete in the sense that neither of these connectives can be defined by the other (so, as we say, they are independent of each other). The symbol \ulcorner \bot \urcorner can be viewed as representing a constant truth function (either unary or binary) that returns the truth value False for any input or inputs. Or it can be regarded as a constant, which means that it is a zero-ary (zero-input) function, a degenerate function, which refers to the truth value False. Although not using our contemporary terminology, Peirce took the second option.

This set has cardinality 2 (it has exactly two members) but it is not the best we can do. Peirce’s discovery of what we have called the Sheffer Functions (anachronistically and unfairly to Peirce, but bowing to convention) shows that we can have a set of cardinality 1 (a one-member set or a so-called singleton) that is minimally functionally complete with respect to the definable connectives of two-valued propositional logic. Thus, either one of the sets \{\mid\} and \{\downarrow\} can do. The sets are functionally complete and, because they have only one member each, we say that the connectives themselves have the property of functional completeness. \ulcorner \mid \urcorner is the symbol of the Sheffer Stroke or NAND and \ulcorner \downarrow \urcorner is the symbol of the Peirce Arrow or NOR. (We stipulate as such, even though we have not introduced our grammar formally.)

It is important to show, albeit briefly, how these functions can define other functions. Algebraically approached, this is a matter of functional composition, but we do not enter into such details here; more details are given in subsequent sections. One may wonder why it suffices to define the connectives of the set that comprises the symbols for negation, inclusive disjunction, and conjunction, namely \{ \neg, \vee, \wedge \}. The explanation is that there is an easy, although informal, way to show that this set is functionally complete. It is not minimally functionally complete, because \ulcorner \vee \urcorner and \ulcorner \wedge \urcorner are inter-definable in the presence of negation, but it is functionally complete. Thus, showing that one can define these functions suffices for achieving functional completeness. Definability should be thought of as logical equivalence: one connective can be defined by means of others if and only if the formulas in the definition (what is defined and what is doing the defining) are logically equivalent. (Presuppose the truth-tabular definitions of the connectives.)

    \[\neg p \stackrel{\text{def}}{=} (p \mid p)\]

    \[(p \vee q) \stackrel{\text{def}}{=} ((p \mid p) \mid (q \mid q))\]

    \[(p \wedge q) \stackrel{\text{def}}{=} ((p \mid q) \mid (p \mid q))\]

    \[\neg p \stackrel{\text{def}}{=} (p \downarrow p)\]

    \[(p \vee q) \stackrel{\text{def}}{=} ((p \downarrow q) \downarrow (p \downarrow q))\]

    \[(p \wedge q) \stackrel{\text{def}}{=} ((p \downarrow p) \downarrow (q \downarrow q))\]
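
Since definability is understood here as logical equivalence, definitions of this kind can be checked by running through all four valuations. The Python sketch below is an illustration only; the function names are hypothetical. Note that the NOR case mirrors the NAND case: the doubled form (p NOR q) NOR (p NOR q) yields disjunction, while (p NOR p) NOR (q NOR q) yields conjunction.

```python
from itertools import product

def nand(p, q):
    return not (p and q)

def nor(p, q):
    return not (p or q)

# Negation, disjunction and conjunction written with NAND only.
def not_by_nand(p):
    return nand(p, p)

def or_by_nand(p, q):
    return nand(nand(p, p), nand(q, q))

def and_by_nand(p, q):
    return nand(nand(p, q), nand(p, q))

# The same three connectives written with NOR only.
def not_by_nor(p):
    return nor(p, p)

def or_by_nor(p, q):
    return nor(nor(p, q), nor(p, q))

def and_by_nor(p, q):
    return nor(nor(p, p), nor(q, q))

valuations = list(product([True, False], repeat=2))
all_correct = all(
    not_by_nand(p) == (not p) and not_by_nor(p) == (not p)
    and or_by_nand(p, q) == (p or q) and or_by_nor(p, q) == (p or q)
    and and_by_nand(p, q) == (p and q) and and_by_nor(p, q) == (p and q)
    for p, q in valuations
)
print(all_correct)  # True
```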

 

b. The “Discovery” and Principia Mathematica

Bertrand Russell hailed this development (which he considered to be Sheffer’s “discovery”) and, with the co-author of Principia Mathematica Alfred Whitehead, added an entire section in the 2^{nd} edition to take advantage of the discovery. Russell was not aware that Peirce had already made the discovery in the 19^{th} century. Prompted by this applause and urged on by the weight of renewed expectations, Sheffer, who was not a prolific author, returned to the task of taking further advantage of his discovery, but he did not succeed in advancing beyond his initial contribution.

Not only did Russell hail this discovery, but also the oracular thinker and profoundly influential philosopher Ludwig Wittgenstein (Tractatus, 1922) used grandiloquent language in celebrating the discovery that the “ideal” formal language of standard logic can be “reduced” to a single “primitive.” What this all means is discussed in the section on the Significance of the Sheffer Stroke for Mathematical Logic, Philosophical Logic, and Philosophy. Two other influential authors of an early logic textbook, David Hilbert and Wilhelm Ackermann, regarded this development as a rather unimpressive detail.

Despite the hullabaloo about the “single primitive,” efforts to take advantage of this result for constructing economical versions of Predicate Logic were sparse. No doubt, one reason is that a system that would have only the Sheffer Stroke as its connective would require use of unwieldy formulaic expressions. Abbreviation conventions would be needed, at a minimum. Another reason for the neglect, at least in Quine’s case, was that he was preoccupied with generating other, similarly parsimonious, notational variants of logic (including variable-free grammars). It was Moses Schönfinkel, one of the originators of Combinatory Logic, who adopted the Sheffer Stroke as single connective to construct a notational idiom of predicate logic. (See Bimbó, 2010.)

3. The Logical Connectives of Standard Propositional Logic and the Sheffer Stroke

It is time to briefly introduce a notational variant or idiom of standard propositional logic (SPL), within which one can locate the truth function NAND (or Sheffer’s Stroke); by referring to this formal language, one can examine and explicate the properties and significance of the Sheffer Stroke. Because one wants to be able to refer to other logical connectives besides the Sheffer Stroke, one actually lays out an expanded variant of SPL, which is called here SPLexp. Discussion of SPLexp is conducted in a fragment of English; this fragment is enhanced with specially designated symbols and, as such, it serves as our Metalanguage (ML), while SPLexp is the Object Language (OL). Where convenient, ML symbols are borrowed from the OL, and this is done without danger of ambiguity because the context makes clear whether the OL or the ML is employed. As is customary, when symbols are mentioned rather than used, they are placed within quotation marks.

The formal language SPLexp has symbolic resources for single or atomic propositional variables (up to the infinity of the natural numbers), and for logical connectives. It also has auxiliary symbols, and parentheses to be used only for the sake of preventing ambiguity of well-formed expressions. The metalinguistic symbol “\in” means “___ is a member of set —”. For connectives, the expansive idiom includes symbols for all definable unary and binary connectives of the standard propositional logic. For present purposes, there is no need to supply names for all the definable connectives denoted by these symbols. Definitions of the connectives are given by means of the familiar truth table. In brief,

PROPOSITIONAL VARIABLES = \{p, q, r, \ldots, p_{i}, \ldots, q_{i}, \ldots\}, i \in N
CONNECTIVE SYMBOLS = \{ \top^{1}, \bot^{1}, id, \neg, \top^{2}, \bot^{2}, 1, 2, \neg 1, \neg 2, \vee, \wedge, \rightarrow, \leftarrow, \leftrightarrow, \nrightarrow, \nleftarrow, \nleftrightarrow, \downarrow, \mid \}

 

Standard grammatical conventions for the construction of well-formed formulas are used.

N is the set of natural numbers. “\mid \varphi \mid” denotes the truth value of a well-formed formula \varphi. Symbols from the object language are appropriated, trusting that the context removes ambiguity.

There are 2^{2^{1}} = 4 unary connectives and 2^{2^{2}} = 16 binary connectives that are mathematically definable in the standard (two-valued) propositional logic. (In general, if n is the number of inputs to the connective, the number of mathematically definable n-ary connectives in standard propositional logic is 2^{2^{n}}: a truth table for n inputs has 2^{n} rows, and each row can independently receive either of the two truth values.)
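
The count of definable connectives can be confirmed by brute-force enumeration for small arities. The following Python sketch is illustrative only; the function name count_connectives is hypothetical, and it simply lists every possible output column for a truth table with n inputs.

```python
from itertools import product

def count_connectives(n):
    """Count the n-ary truth functions by listing all output columns."""
    rows = list(product([True, False], repeat=n))              # the valuations
    columns = list(product([True, False], repeat=len(rows)))   # one column per connective
    return len(columns)

for n in (1, 2, 3):
    print(n, count_connectives(n), 2 ** (2 ** n))
# prints:
# 1 4 4
# 2 16 16
# 3 256 256
```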

Some characteristic equivalences, which can be checked by the familiar truth table method, are:

    \[(p \mid q) \leftrightarrow \neg (p \wedge q)\]

    \[(p \mid q) \leftrightarrow (\neg p \vee \neg q)\]

    \[(p \mid q) \leftrightarrow (p \rightarrow \neg q)\]

    \[(p \mid q) \leftrightarrow (q \rightarrow \neg p)\]

    \[(p \mid q) \leftrightarrow \neg (\neg p \downarrow \neg q)\]

    \[(p \downarrow q) \leftrightarrow \neg (p \vee q)\]

    \[(p \downarrow q) \leftrightarrow \neg (\neg p \mid \neg q)\]
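
The truth table check mentioned above can be automated. The Python sketch below is an illustration only: the labels are plain-text renderings of the seven equivalences listed, and the names EQUIVALENCES, holds and results are hypothetical.

```python
from itertools import product

def nand(p, q):
    return not (p and q)

def nor(p, q):
    return not (p or q)

def implies(p, q):
    return (not p) or q

# Each entry pairs the left and right side of one equivalence listed above.
EQUIVALENCES = {
    "(p | q) <-> not (p and q)": (nand, lambda p, q: not (p and q)),
    "(p | q) <-> (not p or not q)": (nand, lambda p, q: (not p) or (not q)),
    "(p | q) <-> (p -> not q)": (nand, lambda p, q: implies(p, not q)),
    "(p | q) <-> (q -> not p)": (nand, lambda p, q: implies(q, not p)),
    "(p | q) <-> not (not p NOR not q)": (nand, lambda p, q: not nor(not p, not q)),
    "(p NOR q) <-> not (p or q)": (nor, lambda p, q: not (p or q)),
    "(p NOR q) <-> not (not p | not q)": (nor, lambda p, q: not nand(not p, not q)),
}

def holds(left, right):
    """True if both sides agree on all four valuations of p and q."""
    return all(left(p, q) == right(p, q)
               for p, q in product([True, False], repeat=2))

results = {label: holds(left, right)
           for label, (left, right) in EQUIVALENCES.items()}
for label, ok in results.items():
    print(label, ok)
```

Each of the seven lines printed should end in True. The fifth and seventh entries express each of the two connectives by means of the other applied to negated inputs.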

 

4. Properties of the Sheffer Stroke

An examination of the properties of the Sheffer Stroke begins after having the formal idiom SPLexp in place. Introductory logic textbooks usually omit references to the special properties of the Sheffer Stroke; more advanced logic texts and mathematical logic or metalogic texts always make special mention of this connective and of its dual, the NOR or Peirce Arrow connective. (What “duality” means in this context will be examined soon.)

The student of logic learns that the Sheffer Stroke or NAND, like NOR, has a remarkable characteristic that is called functional completeness or expressive completeness. No other unary or binary connective, besides the Sheffer Stroke and its dual NOR, has this property. No connective of lesser arity (thus, zeroary or unary) has this property, either. When alternative logics are investigated, a fundamental metalogical task consists in querying the existence of functionally complete functions, which may be called Sheffer Functions. Present observations are limited to what is known as standard (sometimes called classical) logic: when it comes to alternative or non-classical logics, the connective defined as negation of conjunction should not be presumed to have the property of functional completeness. (It should be borne in mind that negation and conjunction themselves have different, non-standard, meanings in alternative logics since they are defined over more than the two truth values of standard logic.)

After defining functional completeness, it will be shown that indeed the Sheffer Stroke (or NAND) possesses this remarkable property. One needs to ask also why this is the case and why this is an important characteristic.

This property, functional or expressive completeness, is not to be confused with what is called simply “completeness.” Completeness in that sense means this: relative to what are the logical truths of a formal language \mathcal{L}, whose logical consequence relation is symbolized as “\Vdash_{\mathcal{L}}”, a proof system L is complete if and only if L’s derivability relation, symbolized “\vdash_{L}”, is such that, for every well-formed formula \varphi of \mathcal{L}:

\Vdash_{\mathcal{L}} \varphi if and only if \vdash_{L} \varphi

This is equivalent to:

not-\Vdash_{\mathcal{L}} \varphi if and only if not-\vdash_{L} \varphi

 

Roughly, what this means is that a complete system, and only a complete system, will have failures of proof or failures of derivation in all the cases, and only in the cases, in which one expects the corresponding semantical language to be failing in establishing semantic conclusions or logical truths. Think of a logical truth as a semantical conclusion of any, including the empty, set of premises.

There is more about this fundamental topic of Metalogic in other articles (see Propositional Logic and references there), but caution is needed here to note that functional completeness is not related to that other concept called simply “completeness.”

A logical connective f of a formal language \mathcal{L} is functionally complete with respect to \mathcal{L} if and only if every mathematically definable logical connective f_{j} of \mathcal{L} can be defined in terms only of f.

For the case of a binary connective f_{j}^{2}(p, q) that is functionally complete, all mathematically definable connectives f_{i}^{n}(p_{1}, p_{2}, \ldots, p_{n}) can be defined by using only propositional variables and the connective f_{j}^{2}(p, q).

If one wants to define functional completeness in terms of the familiar semantic device of the truth table, one can do so in the following way:

a connective f is functionally complete for the language of standard propositional logic if and only if the truth tables for all mathematically definable connectives (of any arity) can be constructed with labels on the top row having the symbol for f as the only connective symbol.

For example, this can be done by using only the Sheffer Stroke symbol, \ulcorner \mid \urcorner, for certain familiar connectives of the standard propositional logic. Two truth tables are considered identical if they agree on all the truth value outputs corresponding to the same valuations (truth value assignments as inputs.) Thus, the truth-tabular definition of \ulcorner \wedge \urcorner coincides with the truth table with outputs as in “\wedge/\mid” and the truth-tabular definition of \ulcorner \vee \urcorner coincides with the truth table whose output column is as in “\vee/\mid”. Notice that the labels for \wedge/\mid and \vee/\mid use propositional variables and the only connective symbol they use is of \ulcorner \mid \urcorner.

An alternative and equivalent definition is:

a connective f is functionally complete for the language of standard propositional logic if and only if for every truth table labeled by a well-formed formula of the logic there is an identical truth table whose label has the symbol for f as the only connective symbol.

(Two truth tables are identical if they agree on every truth value output corresponding to the same truth value input assignments.)

The non-trivial question now faced is whether such functionally complete connectives are definable and how high one has to ascend in arity (to unary, binary, and so on) before finding a functionally complete connective. The answer is that one can stop at the level of binary connectives in the case of the standard two-valued logic: the Sheffer functions (the Sheffer Stroke and NOR) are, each, functionally complete. The details are examined below.

One can also define functional completeness as a property of groups or sets of connectives. Sometimes one finds references to functional completeness as a property of systems of connectives.

A set X = \{f_{1}, \ldots, f_{n}\} of connectives is functionally complete with respect to a formal language \mathcal{L} if and only if every mathematically definable connective f of \mathcal{L} can be defined by using only members of X.

Such a set X is then itself a member of the set of functionally complete sets of the language, FC(\mathcal{L}). Thus, using “\in” as the symbol for set membership,

    \[\{\mid\} \in FC(\mathcal{L})\]

This means that the one-member set (singleton set) with the Sheffer Stroke as its only member is a functionally complete set or is a member of the set of functionally complete sets.

Of special interest is a proper subset of the functionally complete sets of connectives FC(\mathcal{L}): sets that are minimally or non-redundantly functionally complete, MFC(\mathcal{L}). Here is what this means.

A set X = \{f_{1}, \ldots, f_{n}\} of logical connectives is minimally functionally complete (MFC) if and only if it is functionally complete (FC) and also it is the case that no connective in the set can be defined by using other connectives in the set.

If so, then each connective in the set is independent of the other connectives or simply independent.

Now, consider the set that is comprised only of the Sheffer Stroke connective:

    \[X_{\mid} = \{\mid\}\]

This set is functionally complete. Since it has only one member, this set has to be MFC (minimally functionally complete) if it is FC (functionally complete). This is because there is only one connective; so, it is impossible for it to be defined in terms of other connectives in the set: this connective has to be independent! There are exactly two such MFC singletons in standard propositional logic (up to binary connectives):

    \[X_{\mid} = \{\mid\} \in MFC(\mathcal{L})\]

    \[X_{\downarrow} = \{\downarrow\} \in MFC(\mathcal{L})\]

This section concludes by highlighting certain other properties possessed by the Sheffer Stroke connective. This is done because having those properties is the underlying reason that the Sheffer Stroke connective is functionally complete. This brief investigation of those properties and of how they are related to functional completeness encapsulates the results established by the mathematicians Emil Post (1921, 1941) and William Wernick (1942).

1. Before examining the relationship between functional completeness and certain other logical properties of the Sheffer Stroke, there is a straightforward way of establishing that

    \[\{\mid \}\]

is FC (functionally complete). To do this, consider how one can use the truth-tabular definition of any connective to extract its Disjunctive Normal Form (DNF). Here is an example. The same can be done with any connective regardless of its arity. This example is of a ternary or three-place connective. Extra columns are added to the truth table for illustrative purposes. In these columns the truth values of the individual propositional variables are traced across the rows in which the connective receives T as its truth value; then, it is shown how the DNF is formed.

Consider the Disjunctive Normal Form (DNF) of the given ternary connective. The method is this: form conjunctions of the atomic propositional variables across each row in which the connective receives the truth value T; the variables are written as seen in the added row of the example. Then form the inclusive disjunction of the conjunctions constructed in the previous step. Based on this truth table and the DNF that can be obtained, the definition of this connective in DNF could be given as follows:

    \[f_{\#}(p, q, r) \stackrel{\text{def}}{=} (p \wedge q \wedge \neg r) \vee (p \wedge \neg q \wedge r) \vee (\neg p \wedge \neg q \wedge r)\]
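The extraction procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `dnf_rows` and `render_dnf`, and the encoding of T and F as Python booleans, are introduced here and are not part of the original presentation.

```python
from itertools import product

def f_sharp(p, q, r):
    # The ternary connective of the example, given here by its DNF.
    return (p and q and not r) or (p and not q and r) or (not p and not q and r)

def dnf_rows(f, arity):
    # Collect the rows of the truth table on which f takes the value T.
    return [row for row in product([True, False], repeat=arity) if f(*row)]

def render_dnf(rows, names):
    # One conjunction per T-row: a variable is negated where the row has F.
    conjunctions = [
        "(" + " ∧ ".join(n if v else "¬" + n for n, v in zip(names, row)) + ")"
        for row in rows
    ]
    # The DNF is the inclusive disjunction of those conjunctions.
    return " ∨ ".join(conjunctions)

rows = dnf_rows(f_sharp, 3)
print(render_dnf(rows, ["p", "q", "r"]))
```

Run on the example connective, the sketch recovers the three conjunctions of the definition above, one per row in which the connective receives T.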

 

A note about formal grammar: One takes advantage of the associativity of conjunction, inclusive disjunction and equivalence to omit unnecessary parentheses without ambiguity: when a connective \# is associative, “(\varphi \# \chi) \# \psi” and “\varphi \# (\chi \# \psi)” are equivalent; hence, they can both be written as “\varphi \# \chi \# \psi”. Note omission of outer parentheses.

Because the above can be done for any connective, one can conclude that the set

    \[X_{1} = \{\neg, \vee, \wedge \}\]

is FC (functionally complete): any mathematically definable connective of the propositional language can be defined by using only connectives from the set X_{1}. This includes the unary connectives. First, notice that negation is included in the set X_{1}. The other three unary connectives are also definable as shown below.

    \[id\ p \stackrel{\text{def}}{=} p; \top^{1} p \stackrel{\text{def}}{=} (p \vee \neg p); \bot^{1} p \stackrel{\text{def}}{=} (p \wedge \neg p)\]

 

And when it comes to connectives of arity \geq 2, the truth table shows how to define them by using only connectives from the set X_{1}, by the DNF method described above.

The set

    \[X_{1} = \{\neg, \vee, \wedge \}\]

is functionally complete, as just established, but it is not minimally functionally complete. There is redundancy in it because the connectives conjunction and inclusive disjunction are inter-definable as can be seen in light of the following so-called DeMorgan equivalences:

    \[(p \vee q) \leftrightarrow \neg (\neg p \wedge \neg q)\]

    \[(p \wedge q) \leftrightarrow \neg (\neg p \vee \neg q)\]

 

The following two sets are not only FC (functionally complete) but also MFC (minimally functionally complete):

    \[X_{2} = \{\neg, \vee \}\]

    \[X_{3} = \{\neg, \wedge \}\]

 

Now consider the Sheffer Stroke. In order to show that the set X_{\mid} = \{\mid \} is FC, show that negation and either conjunction or inclusive disjunction are definable in terms of the Sheffer Stroke. Because \{\neg, \vee \} and \{\neg, \wedge \} are, each, functionally complete, if the symbolized connectives in any one of these sets are definable in terms of the connective in \{\mid \}, then this latter set also must be functionally complete.

It can be shown that, indeed, negation and inclusive disjunction, as well as conjunction, are definable in terms of the Sheffer Stroke. The truth table method can be used to verify that the following equivalences indeed obtain:

    \[\neg p \leftrightarrow (p \mid p)\]

    \[(p \vee q) \leftrightarrow ((p \mid p) \mid (q \mid q))\]

    \[(p \wedge q) \leftrightarrow ((p \mid q) \mid (p \mid q))\]

These equivalences can be justified in another way. Taking advantage of certain valid equivalences of the standard propositional logic, which are used to make replacements of phrases by their equivalents without alteration to truth value, one has:

    \[\neg p \leftrightarrow \neg (p \wedge p) \leftrightarrow (p \mid p)\]

    \[(p \vee q) \leftrightarrow \neg (\neg p \wedge \neg q) \leftrightarrow (\neg p \mid \neg q) \leftrightarrow ((p \mid p) \mid (q \mid q))\]

    \[(p \wedge q) \leftrightarrow \neg \neg (p \wedge q) \leftrightarrow \neg (p \mid q) \leftrightarrow ((p \mid q) \mid (p \mid q))\]
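The truth table check of these three equivalences can be mechanized. The following is a minimal sketch, assuming T and F are encoded as Python booleans; the function names `stroke`, `neg`, `disj`, and `conj` are chosen here for illustration.

```python
from itertools import product

def stroke(p, q):
    # Sheffer Stroke (NAND): F only when both inputs are T.
    return not (p and q)

def neg(p):
    # Negation defined as p | p.
    return stroke(p, p)

def disj(p, q):
    # Inclusive disjunction defined as (p | p) | (q | q).
    return stroke(stroke(p, p), stroke(q, q))

def conj(p, q):
    # Conjunction defined as (p | q) | (p | q).
    return stroke(stroke(p, q), stroke(p, q))

values = [True, False]
# Compare each stroke-only definition with the built-in connective on every row.
assert all(neg(p) == (not p) for p in values)
assert all(disj(p, q) == (p or q) for p, q in product(values, repeat=2))
assert all(conj(p, q) == (p and q) for p, q in product(values, repeat=2))
```

Since the comparison runs over every assignment of truth values, passing all three assertions amounts to checking the complete truth tables.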

2. Emil Post (1921, 1941; see also Pelletier and Martin, 1990) showed that any set X of definable logical connectives of the standard propositional logic is functionally complete if and only if X is not a subset of any of the following sets of connectives:

  1. the set of monotonic connectives (MC(\mathcal{L}));
  2. the set of linear (also called countable, counting, or affine) connectives (L(\mathcal{L}));
  3. the set of self-dual connectives (SD(\mathcal{L}));
  4. the set of truth-preserving connectives (TP(\mathcal{L}));
  5. and the set of falsehood- (or falsity-) preserving connectives (FP(\mathcal{L})).

If one single logical connective f is to be functionally complete by itself (or if the singleton set with the function symbolized by f as its only member is functionally complete), then the function f must lack all of the above properties of connectives. In other words,

  1. f should not be monotonic;
  2. f should not be linear;
  3. f should not be self-dual;
  4. f should not be truth-preserving;
  5. f should not be falsehood-preserving.

After briefly defining these properties, it can be shown that, among all the definable unary and binary connectives of the standard propositional logic, only the Sheffer functions (the Sheffer Stroke or NAND and the Peirce Arrow or NOR) lack all of them. To determine whether a set of functions is functionally complete, perform the following check for each of the properties above: there must be at least one function in the set that lacks that property. The function that lacks one property need not be the same as the function that lacks another.

One can engage briefly the deeper analysis behind this seminal result, which can be called the Post Result (while the test presented above can be called the Post Test): All the enumerated properties are so-called Hereditary Properties. This means that if a function f (or corresponding semantic connective) has a property like this, then all functions that can be defined by using only f must also have this property. This means that every such hereditary property \mathbb{P} is “inherited” necessarily by all functions that are defined by means only of the function f that has \mathbb{P}. But these hereditary properties are not characteristic of all definable functions. In other words, there are definable functions that lack \mathbb{P}, for each hereditary property \mathbb{P}. This explains Post’s result. A function that can indeed define, just by itself, all definable functions should not have any one of the hereditary properties because, if it had any such property, it would necessarily transmit it to every function it defines; but, then, the function could not define functions that lack this property.

Monotonicity:

Consider the case of binary functions, given the present inquiry. Note that these definitions of properties apply for n-ary functions in general. Note also that the two truth values

    \[(2 = \{T, F\})\]

are ordered so that the truth value denoted by “F” is lower than that denoted by “T”. The table makes this point:

This is called partial ordering and, as a relation, it can be defined set-theoretically as:

    \[\{<F, F>, <F, T>, <T, T>\}\]

 

A binary function

    \[f(x, y)\]

is monotonic if and only if, for all input values x_{1}, x_{2}, y_{1}, y_{2}:

If x_{1} \leq x_{2} and y_{1} \leq y_{2}, then f(x_{1}, y_{1}) \leq f(x_{2}, y_{2})

 

What does this mean in our case of binary logical connectives? There is a test, which follows from this definition, for determining whether a given binary connective is monotonic or not. The test goes like this. Start by writing out the input pairs for truth values (<T, T>, <T, F>, <F, T>, <F, F>) by using a diagram as shown below. This diagram arranges the pairs of truth values so that the arrows show the ordering just talked about. Next, write as a superscript for each pair of input values the truth value taken by the connective for that pair. For instance, for conjunction one has

    \[<T, T>^{T}, <T, F>^{F}, <F, T>^{F}, <F, F>^{F}\]

For the Sheffer Stroke (as can be checked from its truth table), one has:

    \[<T, T>^{F}, <T, F>^{T}, <F, T>^{T}, <F, F>^{T}\]

There is failure of monotonicity if and only if there is at least one case in which one can proceed down the arrows from a T to an F superscript. This type of diagram is used below to show that the Sheffer Stroke is non-monotonic.

Since there are instances in which there is a shift in truth value of the connective from T to F as one goes down the red arrows, one can infer that this connective is not monotonic.

The set of monotonic unary or binary connectives is:

MC(\mathcal{L}) = \{\wedge, \vee, \top^{2}, \bot^{2}\}

 

The Sheffer Stroke is not among them. Likewise, the Sheffer Stroke fails to be included among the other types of connectives identified above. And, so, by Post’s result, the Sheffer Stroke is functionally complete.

Linearity:

A logical connective is linear (also called countable, counting, or affine) if and only if it is the case that either all or none of the propositional inputs affect the truth value of the output. This means that for a linear connective, and only in the case of such a connective, for each of its inputs, changing the value of the input results in one of the following two cases: either the output value always changes or it never changes. For present purposes, concentrate on unary and binary connectives and proceed straight to presenting a test that can be used to determine whether a connective is linear or not. Here is how the test works: Check the cases in which the connective takes T as its truth value. Call these the T-cases. Similarly, call the rest the F-cases. Then count the number of input variables that take T. If, and only if, the connective is linear, this number is always even for the T-cases and odd for the F-cases, or it is always odd for T-cases and even for the F-cases. One can show that this is not the case for the Sheffer Stroke. Hence, the Sheffer Stroke is not countable.

The rule is violated. In the one case in which the connective takes F as its truth value (<T, T>), the number of input variables that take T is even (two); but among the cases in which the connective takes T, there is a mixture of odd counts (<T, F> and <F, T>, one each) and an even count (<F, F>, zero). Hence, the Sheffer Stroke is not a linear connective.
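The counting test can be illustrated with a short Python sketch. The helper `parity_profile` is a name introduced here; it simply records, separately for the T-cases and the F-cases of a binary connective, whether the number of T inputs is even (0) or odd (1). Exclusive disjunction is included only as a contrasting example of a connective that passes the test as stated.

```python
from itertools import product

def parity_profile(f):
    # For each output value, collect the parities of the count of T inputs.
    profile = {True: set(), False: set()}
    for p, q in product([True, False], repeat=2):
        profile[f(p, q)].add((p + q) % 2)  # 0 = even number of T inputs, 1 = odd
    return profile

def stroke(p, q):
    return not (p and q)

def xor(p, q):
    return p != q

print(parity_profile(stroke))  # T-cases mix odd and even counts
print(parity_profile(xor))     # T-cases are all odd, F-cases all even
```

For the Sheffer Stroke the T-cases contain both parities, so the separation into "always even" and "always odd" demanded by the test fails.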

The linear unary and binary connectives are as follows, and the Sheffer Stroke, again, is not among them.

L(\mathcal{L}) = \{\top^{1}, \bot^{1}, \neg, \nleftrightarrow, \top^{2}, \bot^{2}\}

 

Self-Duality:

Consider the case of unary and binary connectives. The dual of a unary or binary connective, (f(p))' and (f(p, q))' respectively, can be defined as follows:

    \[(f(p))' \stackrel{\text{def}}{=} \neg f(\neg p); (f(p, q))' \stackrel{\text{def}}{=} \neg f(\neg p, \neg q)\]

 

Interestingly, the dual of the Sheffer Stroke is the other Sheffer function, NOR.

    \[(p \mid q)' = \neg (\neg p \mid \neg q) \leftrightarrow \neg \neg (\neg p \wedge \neg q) \leftrightarrow (\neg p \wedge \neg q) \leftrightarrow \neg (p \vee q) \leftrightarrow (p \downarrow q)\]

 

Now, a connective has the property of self-duality if and only if it is its own dual. As just seen, the dual of the Sheffer Stroke is the other Sheffer function, NOR; hence, the Sheffer Stroke does not have the property of self-duality. It is not among the members of the set of self-dual unary and binary connectives of the standard propositional logic.
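The duality computation can be checked mechanically. The sketch below assumes a Boolean encoding of truth values; `dual` and `table` are illustrative helper names, not notation from the literature.

```python
from itertools import product

def stroke(p, q):
    return not (p and q)

def nor(p, q):
    return not (p or q)

def dual(f):
    # The dual negates every input and also negates the output.
    return lambda p, q: not f(not p, not q)

def table(f):
    # Output column of the truth table, rows ordered TT, TF, FT, FF.
    return [f(p, q) for p, q in product([True, False], repeat=2)]

print(table(dual(stroke)) == table(nor))     # dual of the Sheffer Stroke vs. NOR
print(table(dual(stroke)) == table(stroke))  # self-duality check
```

The first comparison succeeds and the second fails, which is exactly the claim that the dual of the Sheffer Stroke is NOR and that the Sheffer Stroke is therefore not self-dual.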

SD(\mathcal{L}) = \{\neg, 1, 2, \neg 1, \neg2 \}

 

Truth-Preservation and Falsehood-Preservation:

Finally, consider the two remaining properties of connectives, of interest for these purposes: truth-preservation and falsehood-preservation. It can be shown again that the Sheffer Stroke lacks these properties as well.

A connective is truth-preserving if and only if it yields the truth value T for all cases in which all its variable inputs are T.

In the general case of an n-ary connective, one has:

    \[|f(T, \ldots, T)| = T\]

 

A connective is falsehood-preserving if and only if it yields the truth value F for all cases in which all its variable inputs are F.

    \[|f(F, \ldots, F)| = F\]

 

The truth-preserving and falsehood-preserving unary and binary connectives of the standard propositional logic are given below, and, once again, the Sheffer Stroke is not among them.

TP(\mathcal{L}) = \{id, \top^{1}, \wedge, \vee, \rightarrow , \leftarrow , \top^{2} \}
FP(\mathcal{L}) = \{id, \bot^{1}, \wedge, \vee, \nrightarrow , \nleftarrow , \bot^{2} \}

 

The same tests could be applied on the other Sheffer function, the NOR connective, to show that this connective is also excluded from all these sets. No other unary or binary connectives would be excluded from all these sets. Therefore, the Sheffer Stroke and NOR are functionally complete and are the only connectives (among unary and binary connectives) that are functionally complete.
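The Post Test lends itself to a brute-force check over all sixteen binary connectives. The following Python sketch is one possible way to do this, assuming truth values are encoded as 1 (T) and 0 (F); all function names are introduced here for illustration.

```python
from itertools import product

def rows(n):
    return list(product([0, 1], repeat=n))

def truth_preserving(f, n):
    return f(*([1] * n)) == 1

def falsehood_preserving(f, n):
    return f(*([0] * n)) == 0

def self_dual(f, n):
    # Negating all inputs must negate the output.
    return all(f(*[1 - v for v in r]) == 1 - f(*r) for r in rows(n))

def monotonic(f, n):
    # Raising inputs (componentwise) must never lower the output.
    return all(f(*a) <= f(*b) for a in rows(n) for b in rows(n)
               if all(x <= y for x, y in zip(a, b)))

def linear(f, n):
    # f is linear (affine) iff it equals c0 + c1*x1 + ... + cn*xn modulo 2.
    c0 = f(*([0] * n))
    coeffs = [f(*[1 if j == i else 0 for j in range(n)]) ^ c0 for i in range(n)]
    return all(f(*r) == (c0 + sum(c * v for c, v in zip(coeffs, r))) % 2
               for r in rows(n))

PROPERTIES = [monotonic, linear, self_dual, truth_preserving, falsehood_preserving]

def post_test(functions):
    # functions is a list of (function, arity) pairs; the set passes
    # iff every property is lacked by at least one member.
    return all(any(not prop(f, n) for f, n in functions) for prop in PROPERTIES)

def binary_connectives():
    # Enumerate all 16 output columns for inputs (0,0), (0,1), (1,0), (1,1).
    for outputs in product([0, 1], repeat=4):
        t = dict(zip(rows(2), outputs))
        yield outputs, (lambda x, y, t=t: t[(x, y)])

complete_singletons = [outs for outs, f in binary_connectives() if post_test([(f, 2)])]
print(complete_singletons)
```

The only output columns that survive are those of NOR and NAND. The same `post_test` function can be applied to sets with several members, for example negation together with inclusive disjunction.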

It can be shown that any logical connective, regardless of arity, is functionally complete if it has a property that is called complete symmetry (see Bimbó, 1992).

It can then be ascertained that no unary connectives have this property and that the only binary connectives that have the property are the two Sheffer functions.

A binary logical connective f is completely symmetrical if and only if the following conditions hold. (“| \varphi |” denotes truth value.)

    \[|f(T, T)| = F\]

    \[|f(F, F)| = T\]

    \[|f(T, F)| = |f(F, T)|\]

In the literature, other functionally complete connectives (or, rather, their associated Boolean functions) are also called Sheffer functions. This applies in the case of non-standard or alternative logics, but these fall outside the scope of this article. In the case of the standard propositional logic, a Sheffer function is a function of any arity n (n \geq 2) that is, taken by itself, functionally complete. The relevant fact to consider is this: Regardless of arity, a connective is functionally complete if it is completely symmetrical. This result applies only in the case of the standard (two-valued) propositional logic.

The definition of a completely symmetrical n-ary f^{n} connective is now given:

    \[|f^{n}(T, \ldots, T)| = F\]

    \[|f^{n}(F, \ldots, F)| = T\]

For all other cases (that is, when p_{1}, \ldots, p_{n} are not all T or all F):

    \[|f^{n}(p_{1}, \ldots, p_{n})| = |f^{n}(\neg p_{1}, \ldots, \neg p_{n})|\]

 

Here is a suggestive sketch of a proof of the fact that f^{n} is functionally complete insofar as it is completely symmetrical. (see Bimbó, 1992)

Assume a completely symmetrical n-ary connective, f^{n}. Now, take the case of the truth table that can be constructed for

    \[f^{n}(p, \ldots, p, q, \ldots, q)\]

This truth table must have only four rows, since there are exactly two propositional variables; it will have n columns since the function is n-ary.

Consider the results from the truth table above.

Since the connective is completely symmetrical, it must return or yield the same truth values (either T or F) for input values <T, F> and <F, T>. This yields exactly two cases: one in which the two truth values are T and one case in which the two truth values are both F. The first case has the truth table for the Sheffer Stroke; the second case has the truth table for the NOR connective. Thus,

a. f^{n}(p, \ldots, p, q, \ldots, q) \leftrightarrow (p \mid q), or
b. f^{n}(p, \ldots, p, q, \ldots, q) \leftrightarrow (p \downarrow q)

 

In either case, a functionally complete connective (either the Sheffer Stroke or NOR) can be defined in terms of f^{n}.

Accordingly, every definable function can be defined in terms of f^{n}, since each such function is definable in terms of the Sheffer Stroke (or NOR), which is in turn definable in terms of f^{n}. This shows that f^{n} is itself functionally complete.
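The collapsing step in this proof sketch can be tried out on concrete cases. The Python sketch below assumes Boolean truth values; the ternary connectives `nand3` and `nor3` are examples chosen here (they are not discussed in the sources cited), and `collapse` identifies the first k argument places with p and the remaining ones with q.

```python
from itertools import product

def completely_symmetrical(f, n):
    # All-T must give F, all-F must give T ...
    if f(*([True] * n)) or not f(*([False] * n)):
        return False
    # ... and every mixed row must agree with its negated counterpart.
    mixed = [r for r in product([True, False], repeat=n) if any(r) and not all(r)]
    return all(f(*r) == f(*[not v for v in r]) for r in mixed)

def collapse(f, n, k):
    # Form f(p, ..., p, q, ..., q) with k copies of p and n - k copies of q.
    return lambda p, q: f(*([p] * k + [q] * (n - k)))

def nand3(x, y, z):
    return not (x and y and z)

def nor3(x, y, z):
    return not (x or y or z)

def table(g):
    # Output column for rows ordered TT, TF, FT, FF.
    return [g(p, q) for p, q in product([True, False], repeat=2)]

print(completely_symmetrical(nand3, 3), table(collapse(nand3, 3, 1)))
print(completely_symmetrical(nor3, 3), table(collapse(nor3, 3, 1)))
```

In the first case the collapsed connective has the truth table of the Sheffer Stroke, in the second that of NOR, matching the two cases (a) and (b) distinguished above.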

Apply the Post Test to determine whether a given set of functions is functionally complete, which means that by using only the functions in the set, all mathematically possible functions of the formal language can be defined. There are examples of sets of functions of the standard propositional logic, which are functionally complete, and one can see how the members of these sets lack, taken together, the hereditary properties discussed above. The Sheffer Stroke, and the Peirce Arrow, lack all those properties; therefore, the one-member sets that have as their single members the Sheffer Stroke or the Peirce Arrow are functionally complete. On the other hand, some sets are not functionally complete because some of the identified hereditary properties are not lacked by any one function in the given set. “TP” abbreviates “Truth-Preservativeness”, “FP” abbreviates “Falsehood-Preservativeness”, “SD” abbreviates “Self-Duality”, “M” abbreviates “Monotonicity”, and “L” abbreviates “Linearity.” Lacking the property is indicated by “x” while having the property is labeled by “+”. Thus, look for a set to have some “x” underneath each property across the row if this set is to be functionally complete.

It can be shown that the Sheffer Stroke possesses the property of functional completeness by examining its polynomial representation, which was introduced in section 1a; and the result is:

    \[f_{\mid}(x, y) = xy + 1\]

 

Linearity can be defined for the polynomial representations of the functions as the absence of any multiplicative products from the polynomial. (This also means that all definable unary functions are linear since they have the general form

    \[f_{*}(x) = \alpha x + \beta\]

So, no unary function can be functionally complete since it has to be linear.) By examining the polynomial representation of the Sheffer Stroke we see that it is non-linear as it has a multiplicative product in it. Therefore, it lacks the hereditary property of linearity.

Next, show that it also lacks monotonicity.

    \[0 \leq 1\]

    \[f_{\mid}(0, 0) = 1 \nleq f_{\mid}(1, 1) = 0\]

 

Next, establish that the Sheffer Stroke is not self-dual. For a binary function in polynomial form, the self-duality condition can be given as follows.

    \[f_{*}(x, y) = f_{*}(1 + x, 1 + y) + 1\]

    \[f_{\mid}(x, y) = xy + 1 \neq f_{\mid}(1 + x, 1 + y) + 1 = (1 + x)(1 + y) + 1 + 1 = 1 + x + y + xy\]

 

In fact, the dual of the Sheffer Stroke is the other binary function that is functionally complete, the Peirce Arrow, whose polynomial representation is indeed:

    \[1 + x + y + xy = 1 + f_{\vee}(x, y)\]

Finally, it can be shown that the Sheffer Stroke is not truth-preserving and is not falsehood-preserving.

    \[f_{\mid}(1, 1) = 1 \times 1 + 1 = 1 + 1 = 0\]

    \[f_{\mid}(0, 0) = 0 \times 0 + 1 = 0 + 1 = 1\]
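The polynomial computations of this passage can be reproduced with arithmetic modulo 2. In the sketch below, `anf` computes the coefficients of the polynomial representation (often called the algebraic normal form or Zhegalkin polynomial) from the values of a function; this helper and the other names are assumptions introduced for illustration.

```python
from itertools import product

def anf(f, n):
    # Coefficient of each monomial (a tuple of 0/1 exponents) is the sum mod 2
    # of f over all points lying below that monomial componentwise.
    coeffs = {}
    for mono in product([0, 1], repeat=n):
        total = 0
        for point in product([0, 1], repeat=n):
            if all(p <= m for p, m in zip(point, mono)):
                total ^= f(*point)
        coeffs[mono] = total
    return coeffs

def stroke(x, y):
    return (x * y + 1) % 2

def peirce(x, y):
    return (1 + x + y + x * y) % 2

def is_linear(f, n):
    # Linear means: no monomial of degree 2 or more has a non-zero coefficient.
    return all(c == 0 for mono, c in anf(f, n).items() if sum(mono) >= 2)

def dual(f):
    # Polynomial form of duality: f(1 + x, 1 + y) + 1, computed modulo 2.
    return lambda x, y: (f((1 + x) % 2, (1 + y) % 2) + 1) % 2

print(anf(stroke, 2))                            # coefficients of 1, y, x, xy
print(is_linear(stroke, 2))                      # False: the xy term is present
print(anf(dual(stroke), 2) == anf(peirce, 2))    # the dual is the Peirce Arrow
print(stroke(1, 1), stroke(0, 0))                # neither T- nor F-preserving
```

The coefficient table for the Sheffer Stroke has non-zero entries only for the constant term and for the product xy, in agreement with the representation xy + 1.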

 

5. Significance of the Sheffer Stroke for Mathematical Logic, Philosophical Logic, and Philosophy

The significance of the Sheffer Stroke connective for mathematical logic and metalogic (the study of formal systems of logic) is evident from the observations made in the preceding section regarding the properties of this connective. Those properties are shared by its dual, the NOR connective. These two connectives are the only binary connectives that are functionally or expressively complete. They are also the first such connectives discovered to have this property as one ascends from zeroary or unary connectives. Examination of such properties belongs to what is known as Metalogic (sometimes called Metatheory). The possibility of economizing in the use of theoretical resources is greatly appealing to mathematicians and scientists. The principle widely known as Occam’s razor roughly states that stipulated entities should not be multiplied beyond the bare minimum of what is needed for a proposed theory to be fully constructible. Economy or parsimony with respect to the resources of a theory is considered a virtue and is demanded methodologically in the sense that, between two theories that have equal explanatory power and/or applications, the one that is more parsimonious should be adopted. It is not claimed that we have some independent insight into the subject addressed by the theories (for instance “nature” or a pre-theoretic structure of independent reality.) What is claimed is simply that parsimony or economy of resources is a methodological and theoretical requirement that good theories must meet.

Formal systems of logic, and formal languages, have expressive resources that are symbolic. Economic use of those resources means using in the construction and implementation of the theory as few such resources as is possible without loss of any systemic powers of expression. Ideally, economy dictates that only one resource of a certain kind is to be used, if such a resource is available or definable and is effective in the construction of all the remaining expressive resources. In the case of the connective symbols of a formal language of propositional logic, this reduction to one effective symbol proves to be feasible in the case of the standard propositional logic: hence, the revelatory significance of Sheffer’s discovery (which, as seen, had already been achieved by Peirce). For the reduction to be effective, of course, it must be the case that all other connectives (of any arity \geq 1) are definable in terms of the single connective symbol; in this way all the other connectives can be eliminated as expressive resources without causing a loss of the ability to express what those symbols refer to. Thus, for example, instead of “\neg \varphi”, one can write “\varphi \mid \varphi”, and the same for all other connective symbols.

The advantages obtained from reduction of resources can be concrete in the case of implementations or applications of formal systems. For instance, in the construction of logic gates in electronic circuitry, the gate-types NAND and NOR are the electronic-theoretical interpretations of the same Boolean functions that are propositionally interpreted as the Sheffer connectives. As one ought to expect, NAND and NOR are universal gates. This means that any theoretically definable gate can be actually constructed from using just NAND gates or just NOR gates. Discoveries of this kind signal that a reduction in complexity is feasible, and this result can have economic and design advantages.

In practice, the advantage claimed from this reduction is outweighed by the fact that writing out well-formed expressions becomes prohibitively unwieldy if only one kind of connective symbol is used. For example, in the history of modern logic, Gottlob Frege’s notational variant never had a chance of being widely adopted because of the practically unmanageable demands it placed on typographical execution. One can think of this challenge as posing a trade-off between economy of resources and notational convenience. Or the trade-off is between reducing the type of resource (for instance, gate) used and needed, on the one hand, and the length or extension of the constructions that will have to be made, on the other. For example, to return to propositional logic, to express a well-formed formula like “\neg p \vee q” in terms of a single connective symbol, \mid, one must write out the much longer equivalent well-formed formula shown below. The notational version being used in this way is significantly more unwieldy than a notational version (a grammar) that uses more, not fewer, connective symbols. Consider the formula

    \[((p \mid p) \mid (p \mid p)) \mid (q \mid q)\]

It is possible to adopt conventions that remove its complexity to some degree. For instance, stipulating that “\varphi \mid \varphi” is to be written as “\varphi^{2}”, permits simplification of the formula above to

    \[(p^{2})^{2} \mid q^{2}\]
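The trade-off between economy of symbols and length of formulas can be made concrete with a small translator. The sketch below assumes a hypothetical encoding of formulas as nested tuples (`('var', name)`, `('not', A)`, `('or', A, B)`, `('and', A, B)`) and rewrites them using only the stroke, following the equivalences given in the preceding section.

```python
def to_stroke(formula):
    # Translate a formula into an equivalent string that uses only "|".
    kind = formula[0]
    if kind == "var":
        return formula[1]
    if kind == "not":
        a = to_stroke(formula[1])
        return f"({a} | {a})"
    a, b = to_stroke(formula[1]), to_stroke(formula[2])
    if kind == "or":
        return f"(({a} | {a}) | ({b} | {b}))"
    if kind == "and":
        return f"(({a} | {b}) | ({a} | {b}))"
    raise ValueError(f"unknown connective: {kind}")

# The formula "not p, or q" from the text.
example = ("or", ("not", ("var", "p")), ("var", "q"))
print(to_stroke(example))

# Nesting shows how quickly the stroke-only version grows.
nested = ("and", example, ("not", example))
print(len(to_stroke(nested)))
```

Each translated subformula is copied at least twice, so the length of the stroke-only formula grows rapidly with the nesting depth of the original. This is the practical cost of using a single connective symbol.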

It is less obvious whether there is a deeper philosophical significance of the fact that a connective like Sheffer’s Stroke is available in a system of logic. Whitehead and Russell expressed boundless enthusiasm about Sheffer’s discovery, hinting only at an underlying significance of this while adopting the connective symbol in the second edition of Principia Mathematica. On the other hand, two other pioneering writers of logic textbooks, Hilbert and Ackermann, were unimpressed and reported on the Sheffer Stroke as if they were referring to trivia. Certainly, the Sheffer functions do not add to the logical system of standard propositional logic in any way. The simplification they make possible is an internal matter. If there are other logics for which, hypothetically, Sheffer functions are not available, this does not automatically mean that there is something wrong with those other systems insofar as they are assessed as formally constructed languages.

It was the influential thinker Ludwig Wittgenstein who attributed far-reaching significance to the fact that Sheffer functions are available. He did this in a somewhat obscure fashion in an influential logical-philosophical work.

a. Wittgenstein’s Tractatus and the Sheffer Stroke

In his Tractatus Logico-Philosophicus (1922, 5.1311, 6.001) Wittgenstein extolled the significance of the Sheffer functions, hinting that discovery of the functions vindicates some of the seminal claims he was raising in this famous text. It is not clear that Wittgenstein knew that there are two binary functions with the same property of being functionally complete. Wittgenstein’s connective symbol may appear, at first blush, to be the same symbol as NOR, which is the connective used by Sheffer himself in his alternative axiomatization of Huntington’s system. Even Bertrand Russell mistook Wittgenstein’s connective for NOR. In fact, Wittgenstein uses a rather eccentric function, known in the literature as the N-operator, which has attracted attention and even led to disputes. Although this is not the place to enter into details, a few words are in order about Wittgenstein’s N-operator, which is not the sentential NOR operator although it is inspired by it. A technical study of the subject is given by Soames (1983; see also Geach, 1981).

Wittgenstein’s N-operator is defined over an open-ended set of propositional variables. Because the language that is needed is that of first-order or predicate logic, a propositional variable atom is a predicate symbol, of any arity n, accompanied by n individual constants all of which have as specified referents members of the universe of discourse (or domain). It is an open problem for Wittgenstein’s language (whose grammar specification is rudimentary) that the domain set may or may not have a denumerably infinite number of subjects. For this brief excursion, assume a finite domain, and bear in mind that whatever fixes are available to address problems with Wittgenstein’s operator are not effective in the case of an infinite domain. Consider a grammar that comprises symbol letters for individual variables, individual constants, predicate (non-logical) constants, and the operator symbol. (These are not Wittgenstein’s symbols. Instead he legislates:

    \[\{ẑ, Nẑ\}\]

where the circumflex hints at the recursive mode of defining what expressions are grammatically correct. He uses “ξ” instead of “z” for molecular, not necessarily atomic, well-formed formulas.)

    \[\{x/a_{i}/F_{j}^{n}/N\}\]

Then, application of the N-operator is, by definition, to negate all atomic propositions

    \[\ulcorner F^{n}a_{1} \ldots a_{n}\urcorner\]

in the set. This means that the N-operator can be defined through the following logical equivalences (insofar as the additional symbols are allowed in the metalanguage). The symbols for the universal and existential quantifiers are “\forall” and “\exists”, respectively. These are missing from Wittgenstein’s language, which is more parsimonious; but, as will be seen, Wittgenstein’s language, constructed on the N-operator, is expressively incomplete! Take, as an example, the case of a unary predicate constant:

    \[N\{Fa_{1}, \ldots, Fa_{n}\} \leftrightarrow \forall x \neg Fx \leftrightarrow \neg \exists xFx\]

 

One could then proceed to iterated applications of the N-operator, which will now give a clue as to how Wittgenstein’s operator is expressively incomplete.

    \[N(N\{Fa_{1}, \ldots, Fa_{n}\}) \leftrightarrow \forall x \neg (\forall x \neg Fx) \leftrightarrow \forall x\exists xFx \leftrightarrow \exists xFx\]
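Over a finite domain, the two equivalences above can be checked directly. The following sketch treats the N-operator as joint negation of a finite collection of truth values; the domain and the predicate `F` are hypothetical examples chosen only for illustration.

```python
def N(propositions):
    # Joint negation: T exactly when every proposition in the collection is F.
    return not any(propositions)

domain = [1, 2, 3, 4]      # hypothetical finite universe of discourse

def F(a):
    # Hypothetical unary predicate: "a is even".
    return a % 2 == 0

atoms = [F(a) for a in domain]

# One application of N: "for all x, not Fx", equivalently "there is no x with Fx".
assert N(atoms) == all(not F(x) for x in domain)
assert N(atoms) == (not any(F(x) for x in domain))

# Iterated application: "there is an x with Fx".
assert N([N(atoms)]) == any(F(x) for x in domain)
```

The sketch depends on the domain being finite, since the collection of atomic propositions is built by enumeration. It says nothing about the infinite case.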

 

The symbolic language cannot sort out more than one individual variable within the scope of another variable. It can express a formula like the following:

    \begin{multline*}N \lbrack N\{F_{1}a_{1}, \ldots, F_{1}a_{n}\}, N\{F_{2}a_{1}, \ldots, F_{2}a_{n}\}\rbrack \leftrightarrow \forall x \neg (\forall x \neg F_{1}x \vee \forall x \neg F_{2}x) \leftrightarrow \\ \forall x(\exists xF_{1}x \wedge \exists xF_{2}x) \leftrightarrow (\exists xF_{1}x \wedge \exists xF_{2}x) \end{multline*}

But the language lacks the resources to express formulas like the following, for which differentiation of individual variables within scopes is required:

    \[\forall x \exists yFxy\]

    \[\exists y \forall xFxy\]

 

Interestingly, the language also lacks resources for expressing \ulcorner \exists x \neg Fx\urcorner . As Soames shows (1983), the defect can be remedied by adopting some additional symbolic convention that permits differentiation of individual variables within scopes. Thus, ironically, Wittgenstein’s constructed analogue to a Sheffer function, his N-operator, lacks expressive completeness. The set \{N\} dispenses with the need for other connective symbols, and also for quantifier symbols (which Wittgenstein takes to be defined through inclusive disjunction or conjunction, again disregarding the prospect of an infinite domain); yet, the language cannot express all constructible formulas of first-order logic. It was Moses Schönfinkel, the originator of combinatory logic (Bimbó, 2010), who constructed a functionally complete language for first-order logic using one Sheffer function.

To conclude, consider the discussion of functional completeness, as touted by Wittgenstein in the Tractatus, putting aside the vicissitudes of his symbolic language. Although Wittgenstein claimed that the main subject of his Tractatus is ethical, the work examines a plethora of philosophical and logical subjects. An oft-discussed overriding objective of the work is to demarcate the limits of language; what cannot be expressed by language can be “shown,” as Wittgenstein famously claimed. The present subject fits within the Tractatus’ discussion of the nature of propositional logic and its relationship to the task of elucidation of meaning. (See Wittgenstein.)

Bursting onto the scene on the heels of advances in modern logic made by Frege and Russell, the Tractatus is remarkable for its contributions to the philosophical discussion of the new logic as an instrument for clarification of logical meaning. Wittgenstein later abandoned the work’s objective of constructing an ideal formal language that would be “isomorphic” to the world of empirically ascertainable facts; he also moved away from a version of the Correspondence Theory of Truth that seems to be underpinning the Tractatus.

In the Tractatus, Wittgenstein explains that the logic of our theories about the world is not itself to be sought in the world. Let us assume that “A” symbolizes the proposition expressed by the sentence “snow is white” and “B” symbolizes the proposition “snow is a kind of precipitation.” Let us also assume for our present purposes that the truth or falsehood of propositions A and B is to be established by referring to empirical facts. It so happens in this example that both propositions, expressed by the two English sentences, are true in our actual, empirically accessible, world. Now form the compound proposition “A and B.” This new proposition must be true because both its component propositions are true. This is evident because of the meaning of “and.” But how is this known? The empirical world itself does not come to our assistance. We know this regardless of empirical experience: what we know is that any compound proposition of the logical form “p and q” has to be true if, and only if, both of its components, the individual or atomic propositions p and q, are true. Thus, given p and q, the conclusion “p and q” follows validly: it is logically impossible to have a case in which the given premises are all true but the conclusion is false. Moreover, the logical meaning of any conjunctive proposition of the logical form “p and q” is identical with its truth conditions, which comprise the determinate relations between truth value assignments to the components (whether p and q are true or false) and the functionally determined truth value of the whole conjunction. Thus, the empirical fact that the conjunctive sentence is true in our actual world is irrelevant from the standpoint of the logical meaning (the truth conditions) of the logical form exemplified by the sentence “snow is white and snow is a kind of precipitation.” The valuation dependency

    \[<<T, T>, T>\]

is one of four logically possible combinations which comprise the truth conditions of the conjunctive logical form:

    \[\{<<T, T>, T>, <<T, F>, F>, <<F, T>, F>, <<F, F>, F>\}\]
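The four valuation dependencies listed above can be generated mechanically. The following is a minimal sketch, assuming Python booleans stand in for T and F; the function name `conjunction_truth_conditions` is hypothetical and introduced only for illustration.

```python
from itertools import product

def conjunction_truth_conditions():
    """Return the truth conditions of 'p and q' as a set of ((p, q), value) pairs."""
    # Every logically possible valuation of p and q is paired with the
    # functionally determined value of the whole conjunction.
    return {((p, q), p and q) for p, q in product([True, False], repeat=2)}

conditions = conjunction_truth_conditions()
for (p, q), value in sorted(conditions, reverse=True):
    print(p, q, "->", value)
```

Only one of the four pairs carries the value true, and no pair is singled out as the actual world, which is the sense in which the actual world is not logically privileged.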

The actual world is not logically privileged, and Wittgenstein’s conceit that an isomorphic mapping can be accomplished, which would produce an ideal language of comprehensive applicability, was bound to be frustrated. Disregarding this rather metaphysical aspect, which Wittgenstein later also disregarded, the Tractatus contains an astute understanding and analysis of the formal logical instrument that has arisen out of modern mathematical developments. Wittgenstein’s contribution to the discussion of functional completeness fits under this aspect of the work.

Wittgenstein makes the point that “internal,” or “structural,” features of propositional forms account for truth preservation from the joint premises to the conclusion of a valid argument form. It is structural features that account, for instance, for the equivalence of logical meaning between any two propositions. This means that the propositions have forms that receive the same truth values for the same valuations (truth value assignments to their components). Cases or valuations (also called interpretations and models) are determined by assigning truth values, true and false, to all the components of a propositional form. Wittgenstein uses the terms “truth grounds” and “(logically) possible worlds” when referring to truth value assignments or valuations. Wittgenstein says that “these relations are internal and they exist as soon as, and by the very fact, that the propositions exist” (1922, 5.13). The next thesis in Wittgenstein’s text (5.1311) is the one in which he uses his N-operator. The point made there is now presented roughly: having briefly examined the complications that arise out of Wittgenstein’s definition of an N-operator, one adjusts, instead, to a propositional language, pretending that Wittgenstein actually used the NOR function to make his case. Nothing is lost in this way because the point is to illustrate Wittgenstein’s remarks on the significance of functionally complete operators rather than to pursue further any details attaching to the N-operator itself.

Consider a valid argument form:

    \[p \vee q, \neg p \vdash q\]

 

The usual name of this valid argument form is Disjunctive Syllogism. This is not a string of propositional forms; it is a schema, and so is something like a recipe for how to proceed correctly when drawing inferences. Wittgenstein makes the point that conventions of symbolism may create the wrong impression that there is no internal, structural connection running through all propositional forms; that there is something newly productive introduced by the multiple (connective) symbols. This, however, would be wrong. The accidental fact that many different symbols are used is what is misleading. Moreover, Wittgenstein has philosophical objections to working from the semantic side of constructing logical systems, and this has consequences for the subject under discussion. Wittgenstein considers semantic attempts to be nonsensical: for instance, to specify the referent of conjunction, in order to obtain a working semantics, commits one to the nonsense of speaking about extra-empirical items and, indeed, about things that cannot be talked about. This way of thinking shows certain underlying philosophical assumptions, which lie beyond this article’s scope, but the problem that arises is this: The construction of a logical system is to be understood as a matter of specifying formal-grammatical rules for concatenating and transforming the available symbolic resources of the system. Because of this, the failure of the grammatical or syntactical setup to show perspicuously what happens in logical operations is serious. Hence, it is imperative to show solely by manipulating the symbolic resources that there is an internal structural connection that relates all possible transformations. This is accomplished by using only one functionally complete operator symbol. This is the reason Wittgenstein extols the “discovery”. 
Even if one opts to multiply connective symbols, because of the greater simplicity and even intuitive appeal gained in that way, it is still crucial to be able to show that only one connective symbol suffices. Indeed, as is known from the above study of functional completeness, one could have opted for eliminating all but one connective symbol, one of the Sheffer functions. Consider further how the claim is to be made that single-connective symbolism reveals something deeper about logic itself.

Logical properties are structural features of the forms: thus, one can have tautologous, contradictory, and indeterminate (also called contingent and indefinite) propositional forms. All tautologies would have to have the same referent which, in the Fregean analysis, is the truth value true. If semantic referents are rejected, however, that leaves the grammatical means for showing the collapse of all tautologies, namely that they all have the same logical meaning. The same is the case with all contradictory logical forms; they check as false for all logically possible assignments of values to their components. The remaining structural type, the contingent propositional form, is basically not logic’s business! This is indicated by the convention of assigning both truth values to a single propositional variable to generate two cases: these are two logically possible worlds if one is to semantically model the setup. The proposition can logically be true in one case and false in another; as a proposition it must be one or the other and it is not logically possible for it to be both true and false. Notice then that the two logical possibilities (p-T and p-F) have the same status. It does not matter if one of those, for an interpretation of the propositional symbol, happens to be the actual world. Logic, not depending on the workings of the empirical world, is attuned to characteristics that are invariable across all possible cases: this means, tautologies, which are true in all logically possible cases, and contradictions, which are false in all logically possible cases. The validity of the inferential schema above guarantees, for two-valued logic, that the following is a propositional tautology:

    \[\vdash ((p \vee q) \wedge \neg p) \rightarrow q\]
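The link between the validity of Disjunctive Syllogism and the tautology just displayed can be checked by brute force over all valuations. This is a minimal sketch under the assumption of two-valued logic; the helper names `implies`, `is_valid`, and `is_tautology` are hypothetical and not drawn from any cited work.

```python
from itertools import product

def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

def is_valid(premises, conclusion, n_vars):
    """Valid iff no valuation makes every premise true and the conclusion false."""
    return all(
        conclusion(*v)
        for v in product([True, False], repeat=n_vars)
        if all(premise(*v) for premise in premises)
    )

def is_tautology(formula, n_vars):
    """A tautology is true under every valuation of its variables."""
    return all(formula(*v) for v in product([True, False], repeat=n_vars))

# Disjunctive Syllogism: p or q, not p, therefore q.
ds_valid = is_valid([lambda p, q: p or q, lambda p, q: not p],
                    lambda p, q: q, 2)
# The corresponding conditional: ((p or q) and not p) -> q.
ds_tautology = is_tautology(lambda p, q: implies((p or q) and (not p), q), 2)
print(ds_valid, ds_tautology)
```

Dropping the second premise makes the check fail, since the valuation with p true and q false then satisfies the remaining premise but not the conclusion.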

 

Once again, the proliferation of symbols obscures the facts about the internal structural simplicity of logic. All compound propositional forms are internally connected because they result from elementary propositional forms by means of connectives. The logic is determined by how the logical connectives are defined. Starting with elementary (also called individual or atomic) propositions, one always proceeds by combining them by means of connectives: the compounds generated are in every case dependent for their meanings (truth and falsehood) on the meanings (truth and falsehood) of their components. If one were to proceed in the opposite direction, from compounds toward the elementary propositions, there would be a decomposition of the compound propositions; the process would terminate with the elementary propositions. This is possible because all the connectives are truth-functional connectives. Thus, if, for instance, “p and q” is given as true, one can dissolve this into “p is true” and “q is true” given the definition of “and.” Once again, one sees that propositional forms are related with each other and, ultimately, they are related to two basic propositions, the true and the false, out of which any complex can be generated by using truth-functional connectives. This also shows that nothing in the logic of propositions can ever be arbitrary.

The symbolism that uses multiple connective symbols obscures this. A stronger point can be made: Something is wrong with a notational idiom, a symbolism, that fails to capture the identity of logical meanings (logical equivalence). For instance, consider the following two logically equivalent expressions or formulas, which are well-formed, it is assumed, in the idiom or notation (and represented here in the symbolically enriched metalanguage):

    \[\neg (p \wedge q) \dashv \vdash (\neg p \vee \neg q)\]

 

Even though the expressions are logically equivalent, the grammatically correct formulas representing them are not the same! This may be considered a radical notational or formal-grammatical defect. It gets even worse. There is a view that formalism is fundamentally a matter of systematic and specified manipulation of symbolic resources. Consequently, the defect faced in this case goes all the way to the roots of the most basic task of all: how to construct a faithful symbolic system relative to a given purpose. In that case, it would appear that the correct way to construct a formal system is exclusively through its minimally functionally complete sets of operators. If one has to switch to alternative idioms that have redundant operators in them (operators that can be defined by the other operators in the system), that would have to be justified by pleading such a reason as expediency or convenience.

The symbolic notation of a formal language idiom that uses only one connective symbol would remove this notational illusion, or, to make the stronger case, would remedy the deep formal-grammatical defect: then it could be perspicuously shown that all that is had is an unfolding of internal connections that run across propositional forms. Wittgenstein proceeds to write the above argument schema by using one single connective symbol which allows elimination of the symbols for disjunction and negation in order to make “the inner connection” obvious. (The contemporary symbol is used here for the connective that Wittgenstein employs, which is NOR.)

    \[(p \downarrow q) \downarrow (p \downarrow q), p \downarrow p \vdash q\]

 

To do this, replace “p \vee q” by “(p \downarrow q) \downarrow (p \downarrow q)” (thus eliminating the inclusive-disjunction symbol) and “\neg p” by “p \downarrow p” (thus eliminating the negation symbol). The NOR symbol is used to effect both eliminations. As a result, we have the schema shown above, in which only one connective symbol is used. Of course, one could have used the NAND or Sheffer Stroke function to effect the same elimination, in which case the result would be:

    \[(p \mid p) \mid (q \mid q), p \mid p \vdash q\]
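The two single-connective rewritings of Disjunctive Syllogism can be compared against the original schema by enumerating valuations. The sketch below assumes Python booleans for truth values; all function names are hypothetical labels chosen for this illustration.

```python
from itertools import product

def nor(a, b):
    """NOR: true only when both arguments are false."""
    return not (a or b)

def nand(a, b):
    """NAND (Sheffer stroke): false only when both arguments are true."""
    return not (a and b)

VALUATIONS = list(product([True, False], repeat=2))

def or_via_nor(p, q):
    return nor(nor(p, q), nor(p, q))

def not_via_nor(p):
    return nor(p, p)

def or_via_nand(p, q):
    return nand(nand(p, p), nand(q, q))

def not_via_nand(p):
    return nand(p, p)

def schema_valid(disjunction, negation):
    """Check that q holds whenever the rewritten 'p or q' and 'not p' both hold."""
    return all(q for p, q in VALUATIONS if disjunction(p, q) and negation(p))

# Each rewriting has the same truth table as the formula it replaces.
rewrites_agree = all(
    or_via_nor(p, q) == (p or q) == or_via_nand(p, q)
    and not_via_nor(p) == (not p) == not_via_nand(p)
    for p, q in VALUATIONS
)
print(rewrites_agree,
      schema_valid(or_via_nor, not_via_nor),
      schema_valid(or_via_nand, not_via_nand))
```

Because the rewritten premises have the same truth tables as the originals, the validity of the schema carries over unchanged to both the NOR and the NAND versions.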

 

Moreover, when multiple logical connectives are used in the construction of a formal system, an impression of arbitrariness may be created. Why, one may ask, is one set of logical connectives used instead of another set? The right answer is that nothing depends on which connectives are used because all the propositional formulas are internally related in strict, non-arbitrary fashion, and the construction ultimately depends on the basic building blocks and connectives. To illustrate this point, construct a formal system of the standard propositional logic by using as its set of connectives either

    \[\{\neg, \rightarrow\}, \{\neg, \vee\}, \text{ or } \{\neg, \wedge\}.\]

Basically, it amounts to the same thing whichever one is used. This is not immediately obvious regarding the plurality of connectives seen above. But now consider how all of the connectives in these sets are definable in terms of the connective in \{\mid\}. Thus, \{\neg, \rightarrow\} can be replaced by \{\mid\}; and \{\neg, \vee\} can be replaced by \{\mid\}; and \{\neg, \wedge\} can be replaced by \{\mid\}. This fact makes clear that nothing depends on arbitrary choices about the connectives used. This discovery can be used as proof that there is a strict internal connection that runs through all the expressive resources.
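The claim that every connective in the three sets is definable from the stroke alone can be illustrated with a short check. This is a sketch only: the names `neg`, `conj`, `disj`, and `cond` are hypothetical, and the particular stroke definitions used are the standard ones rather than anything quoted from the Tractatus.

```python
from itertools import product

def nand(a, b):
    """The Sheffer stroke: false only when both arguments are true."""
    return not (a and b)

# Each connective below is built from nand and nothing else.
def neg(p):
    return nand(p, p)                      # not p  ==  p | p

def conj(p, q):
    return nand(nand(p, q), nand(p, q))    # p and q  ==  (p | q) | (p | q)

def disj(p, q):
    return nand(nand(p, p), nand(q, q))    # p or q  ==  (p | p) | (q | q)

def cond(p, q):
    return nand(p, nand(q, q))             # p -> q  ==  p | (q | q)

def agrees_everywhere():
    """Compare each stroke definition with the usual truth function."""
    for p, q in product([True, False], repeat=2):
        if neg(p) != (not p):
            return False
        if conj(p, q) != (p and q):
            return False
        if disj(p, q) != (p or q):
            return False
        if cond(p, q) != ((not p) or q):
            return False
    return True

print(agrees_everywhere())
```

Since negation, conjunction, disjunction, and the conditional all reduce to the same operator, a choice among the three sets makes no difference to what can be expressed.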

Wittgenstein even points out (5.42) that when the connectives in a formal system are interdefinable, they should not properly be regarded as “primitives.”

Now one can revisit the subject of the triviality of tautologies (and of logical contradictions), which is another subject on which Wittgenstein touches. There is one tautology, and a contradiction is the negation of the tautology (for the standard definition of negation). Of course, negation itself can be expressed in terms of a Sheffer function. The ultimately perspicuous manifestation of the inner structural inter-connectedness of all logical propositions can be shown insofar as all tautologies can be derived from one single axiom that uses a single connective symbol. Rules of transformation and inference can be specified, to be applied to the axiom schema, to generate all tautologies. This is indeed possible, as French logician Jean Nicod (1917) demonstrated by constructing a one-postulate axiomatization of the standard propositional logic. Nicod’s postulate, written with metalinguistic symbols for writing a schema, is:

    \[(\Pi \mid (\Sigma \mid \Psi)) \mid ((\Theta \mid (\Theta \mid \Theta)) \mid ((X \mid \Sigma) \mid ((\Pi \mid X) \mid (\Pi \mid X))))\]

 

An alternative and equivalent formulation of the Nicod Postulate, which avoids having any sub-formula of the postulate schema be tautologous, is the following. (Notably, in the original formulation, the sub-formula \ulcorner \Theta \mid (\Theta \mid \Theta)\urcorner is tautologous.)

    \[(\Pi \mid (\Sigma \mid \Psi)) \mid ((\Pi \mid (\Psi \mid \Pi)) \mid ((X \mid \Sigma) \mid ((\Pi \mid X) \mid (\Pi \mid X))))\]

 

The Nicod Postulate can be deployed as the single axiom in a formal system whose only rule of inference is given by the following rule schema:

    \[\Pi, \Pi \mid (\Sigma \mid \Psi) \vdash_{nicod} \Psi\]
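Both formulations of the postulate, and the rule of inference, can be checked semantically by enumerating valuations. The sketch below is a truth-table check only, not a derivation within Nicod's system; the function names and the spelled-out variable names (`pi`, `sigma`, `psi`, `theta`, `chi` for the schematic letters) are assumptions made for this illustration.

```python
from itertools import product

def nand(a, b):
    """The Sheffer stroke: false only when both arguments are true."""
    return not (a and b)

def nicod_original(pi, sigma, psi, theta, chi):
    """The postulate with the sub-formula theta | (theta | theta)."""
    return nand(
        nand(pi, nand(sigma, psi)),
        nand(
            nand(theta, nand(theta, theta)),
            nand(nand(chi, sigma), nand(nand(pi, chi), nand(pi, chi))),
        ),
    )

def nicod_alternative(pi, sigma, psi, chi):
    """The alternative formulation with pi | (psi | pi) in the middle."""
    return nand(
        nand(pi, nand(sigma, psi)),
        nand(
            nand(pi, nand(psi, pi)),
            nand(nand(chi, sigma), nand(nand(pi, chi), nand(pi, chi))),
        ),
    )

def is_tautology(formula, n_vars):
    return all(formula(*v) for v in product([True, False], repeat=n_vars))

def rule_is_sound():
    """From pi and pi | (sigma | psi), psi must follow under every valuation."""
    return all(
        psi
        for pi, sigma, psi in product([True, False], repeat=3)
        if pi and nand(pi, nand(sigma, psi))
    )

print(is_tautology(nicod_original, 5),
      is_tautology(nicod_alternative, 4),
      rule_is_sound())
```

The check confirms that each schema is true under every valuation and that the rule never leads from true premises to a false conclusion. It does not show that the system is complete, which is the substance of Nicod's result.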

 

6. References and Further Reading

  • Béziau, Jean-Yves. 2001. “Sequents and Bivaluations”, Logique et Analyse 176: 373-94.
  • Bimbó, Katalin. 2010. “Schönfinkel-Type Operators for Classical Logic”, Studia Logica 95: 355-78.
  • Church, Alonzo. 1953. “Review of Sobociński (1953)”, Journal of Symbolic Logic 18: 284-85.
  • Church, Alonzo. 1996. Introduction to Mathematical Logic, revised and enlarged edition, Princeton: Princeton University Press.
  • Geach, P. T. 1981. “Wittgenstein’s Operator N”, Analysis 41: 168-71.
  • Goodell, John D. and Tenny Lode. 1953. “Decision Elements”, Journal of Symbolic Logic 18: 283-84.
  • Hilbert, D., and W. Ackermann. 1928. Grundzüge der theoretischen Logik. Berlin: Springer.
  • Houser, N., Roberts, Don D., and Van Evra, James (eds.). 1997. Studies in the Logic of Charles Sanders Peirce. Bloomington, IN: Indiana University Press.
  • Nicod, Jean G. P. 1917. “A Reduction in the Number of Primitive Propositions of Logic”, Proceedings of the Cambridge Philosophical Society 19: 32-41.
  • Peirce, Charles S. 1931-1966. Collected Papers of Charles Sanders Peirce. 8 volumes, ed. by Hartshorne, C., Weiss, P., and Burks, A. W. Cambridge, MA: Harvard University Press.
  • Peirce, Charles S. 1967. “Annotated Catalogue of the Papers of Charles S. Peirce”, Manuscripts in the Houghton Library of Harvard University, as identified by Richard Robin. Amherst: University of Massachusetts Press.
  • Peirce, Charles S. 1971. “The Peirce Papers: A supplementary catalogue”, Transactions of the C. S. Peirce Society 7: 37-57.
  • Pelletier, Jeffrey Francis and Norman M. Martin. 1990. “Post’s Functional Completeness Theorem”, Notre Dame Journal of Formal Logic 31: 462-75.
  • Post, Emil L. 1921. “Introduction to a General Theory of Elementary Propositions”, American Journal of Mathematics 43: 163-85.
  • Post, Emil L. 1941. The Two-Valued Iterative Systems of Mathematical Logic, vol. 5 of Annals of Mathematical Studies, Princeton: Princeton University Press.
  • Prior, Arthur N. 1962. Formal Logic. Oxford: Clarendon Press.
  • Quine, Willard Van O. 1995. Selected Logic Papers, enlarged edition, Cambridge, MA: Harvard University Press.
  • Read, Stephen. 1999. “Sheffer’s Stroke: A Study in Proof-Theoretic Harmony”, Danish Yearbook of Philosophy 34: 7-24.
  • Riser, John. 1967. “A Gentzen-Type Calculus for Sequents for Single-Operator Propositional Logic”, The Journal of Symbolic Logic 32: 75-80.
  • Sheffer, H. M. 1913. “A Set of Five Independent Postulates for Boolean Algebras, with Application to Logical Constants”, Transactions of the American Mathematical Society 14: 481-88.
  • Soames, Scott. 1983. “Generality, Truth Functions, and Expressive Capacity in the Tractatus”, The Philosophical Review 92: 573-89.
  • Sobociński, Bolesław. 1953. “On a Universal Decision Element”, Journal of Computing Systems 1: 71-80.
  • Wernick, William. 1942. “Complete Sets of Logical Functions”, Transactions of the American Mathematical Society 51: 117-32.
  • Whitehead, Alfred and Bertrand Russell. 1910, 1912, 1913. Principia Mathematica, 3 volumes. Cambridge: Cambridge University Press; 1925, 1927. Principia Mathematica, second edition, 2 volumes, Cambridge: Cambridge University Press.
  • Wittgenstein, Ludwig. 1922. Tractatus Logico-Philosophicus, tr. C.K. Ogden. London: Routledge & Kegan Paul.

 

Author Information

Odysseus Makridis
Email: makridis@fdu.edu
Fairleigh Dickinson University
U. S. A.

Material Composition

A material composite object is an object composed of two or more material parts. The world, it seems, is simply awash with such things. The Eiffel Tower, for instance, is composed of iron girders, nuts and bolts, and so on. You and I, as human beings, are composed of flesh and bone, and various organs. Moreover, these parts themselves are composed of further parts, such as molecules, which themselves are composed of atoms, which are composed of sub-atomic particles. Material composite objects are, it seems, ubiquitous. However, despite their ubiquity, a little philosophical reflection on the matter, as is so often the case, reveals that they are also deeply puzzling.

The question which has received most attention from philosophers interested in material composition is: under what circumstances do two or more material objects compose a further object? Why is it, for instance, that a collection of iron girders that are bolted together in the centre of Paris do compose an object (that is, the Eiffel Tower), but that there is no object composed of the Eiffel Tower and the Moon? What conditions are satisfied by the first set of objects, and not by the second set of objects, which make this the case? In short, what are the necessary and sufficient conditions for composition to occur?

Since the 1980s, philosophers have devoted considerable attention to this question, and it has proved difficult to answer. This article provides a survey of the various answers that have been given to this question, plus the arguments that have been offered in their defence.

Table of Contents

  1. Some Important Preliminaries
    1. Mereological Technicalities
    2. Composition and Constitution
  2. The Special Composition Question
    1. Answering the Special Composition Question
  3. Compositional Restrictivism
    1. Simple Bonding Answers
    2. Series-Style Answers
    3. Sorites Paradoxes and Sharp Cut-Off Points
    4. Brutal Composition
    5. Concluding Remarks
  4. Compositional Universalism
    1. Arguments for Universalism
      1. The Argument from Elimination
      2. The Argument from CAI
    2. Arguments against Universalism
      1. The Gratuitousness of Universalism
      2. The Counter-Intuitiveness of Universalism
      3. The Argument from Primitive Cardinality
      4. The Identity Argument
  5. Compositional Nihilism
    1. Arguments for Nihilism
      1. The Causal Overdetermination Argument
      2. The Problem-Solving Argument
      3. The Argument from Ideological Parsimony
    2. Arguments against Nihilism
      1. The Common-Sense Argument
      2. The Argument from Emergence
      3. The Problem of Atomless Gunk
  6. Deflationism
    1. Hirsch and Quantifier Variance
  7. References and Further Reading

1. Some Important Preliminaries

a. Mereological Technicalities

The topic of material composition falls under the wider purview of mereology, which is simply the study of parts and wholes. Much of the focus of mereology over the last hundred years or so has been on producing a formal theory of part–whole relations, that is, a formal theory of the logical relations that hold between parts and the wholes they compose (examples include Lesniewski, 1916; Leonard and Goodman, 1940; Simons, 1987). The current entry will overlook much of the formal side of the study of mereology, and will instead concentrate on some of the key metaphysical questions concerning the nature of material composite objects, such as whether there are any such things, and what criteria some things need to satisfy in order to compose a composite object. However, it will be useful in the first instance to define a few of the key technical terms and expressions that are peculiar to the field of mereology:

  • Part:

The term ‘part’ has a slightly different meaning in mereology to that which it has in ordinary language. In ordinary language, we use the term part to mean a portion or subsection of an object, for example, the Earth is part of the Solar System, the tail is part of the cat and so forth. In mereology, however, the term is used such that not only are an object’s subsections its parts (for example, the tail is part of the cat), but objects are also taken to be parts of themselves (for example, the cat is part of the cat). So if you were tasked with writing an exhaustive list of all the cat’s parts, on this understanding of the term, you should include the cat itself on the list.

  • Proper Part:

Philosophers have taken to distinguishing parts from what are called ‘proper parts’. ‘Proper part’ is the mereological term that would best tally up with our ordinary or common-sense use of the term ‘part’, in that an object’s proper parts exclude the object itself. Thus, if you were tasked with writing an exhaustive list of all the cat’s proper parts, the cat itself should not be included on the list.

  • Plurally Referring Expressions:

Following Peter van Inwagen (van Inwagen, 1990), it has become common to use the plurally referring expression, ‘the xs’, to refer to some plurality of material objects. This enables one to refer to a number of objects at a time in a neutral manner, without supposing that those objects do (or do not) compose a further object.

  • Composition:

Some xs compose a further object, y =df the xs are all parts of y, none of the xs overlap, and every part of y overlaps at least one of the xs.

(The qualifications about ‘overlap’ in the above definition can make it sound a bit more complicated than it really is. They merely stipulate that one should not list overlapping parts of an object when listing the parts that compose it. For instance, suppose a necklace were made entirely of pearls. In that case, it would be correct to say that the pearls compose the necklace. But, given that the pearls themselves are made of atoms, it would also be correct to say that the atoms compose the necklace. However, it would be wrong to say that the pearls and the atoms compose the necklace, since the pearls overlap the atoms.)

  • Fusion:

y is a fusion of the xs =df the xs compose y

(Note: the term ‘sum’ is sometimes used instead of ‘fusion’.)

  • Simple:

x is simple =df x has no proper parts

(Note: ‘simple’ is sometimes used as a noun, as well as an adjective, thus one might speak of, ‘a simple’, or ‘the simples’.)
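The definitions above can be made concrete in a toy model. The sketch below assumes, purely for illustration, that every object is identified with a non-empty set of atoms and that parthood is the subset relation; this is a simplifying assumption and not a claim made by any of the cited authors. All function names are hypothetical.

```python
from itertools import chain, combinations

def parts(y):
    """All parts of y in the toy model: every non-empty subset of its atoms, y included."""
    atoms = sorted(y)
    subsets = chain.from_iterable(
        combinations(atoms, k) for k in range(1, len(atoms) + 1)
    )
    return [frozenset(s) for s in subsets]

def is_part(x, y):
    return x <= y            # every object counts as a part of itself

def is_proper_part(x, y):
    return x < y             # a part of y other than y itself

def overlap(x, y):
    return bool(x & y)       # the two objects share at least one part

def is_simple(x):
    return not any(is_proper_part(z, x) for z in parts(x))

def compose(xs, y):
    """The xs compose y: all are parts of y, none overlap, every part of y overlaps one."""
    all_parts = all(is_part(x, y) for x in xs)
    disjoint = not any(overlap(a, b) for a, b in combinations(xs, 2))
    covered = all(any(overlap(z, x) for x in xs) for z in parts(y))
    return all_parts and disjoint and covered

# The necklace example: two pearls, each made of two atoms.
necklace = frozenset({"a1", "a2", "a3", "a4"})
pearls = [frozenset({"a1", "a2"}), frozenset({"a3", "a4"})]
atoms = [frozenset({a}) for a in sorted(necklace)]

print(compose(pearls, necklace))          # the pearls compose the necklace
print(compose(atoms, necklace))           # so do the atoms
print(compose(pearls + atoms, necklace))  # but not the pearls and atoms together
```

In this model the pearls and the atoms taken together fail to compose the necklace only because they overlap, which matches the point of the necklace example. The model also makes the necklace a part, but not a proper part, of itself.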

These are just a few of the many technical terms involved in formal mereology, and they are defined here quite informally, for ease of understanding. For those interested in formal mereology, Peter Simons’ 1987 book, Parts: A Study in Ontology, provides an excellent place to start.

b. Composition and Constitution

The debate over material composition should be distinguished from a related debate concerning material constitution. Material composition concerns the question of when two or more objects compose a further, composite object. (For instance, if you attach four wooden legs to a flat wooden surface, do those five objects now compose a new object: a table?) Those interested in material constitution, by contrast, are interested in the question of when one object (for example, a lump of bronze) constitutes another object (for example, a statue of Napoleon), and indeed, what the relation of constitution actually consists in. Material constitution presents some real puzzles of its own. For instance, is the lump of bronze a distinct object from the statue of Napoleon, or are they numerically identical? If we adopt the latter view, that is, that there is just a single object there, but one that can be called by different names (that is, ‘lump of bronze’ or ‘statue of Napoleon’), we seem to run into trouble. The trouble emerges when you consider what happens if you were to melt down the statue and form it into a shapeless lump. The lump of bronze, it seems, would still exist; but the statue would clearly not. By melting it down, you destroy the statue, but you do not destroy the lump. This might suggest, therefore, that the lump and the statue were not identical after all. Perhaps, then, we should adopt the former view, and say that the statue and the lump are not identical but, in fact, distinct objects. The problem with this, however, is that it now looks as though, before the melting down occurred, we had two distinct objects occupying exactly the same space at exactly the same time, which, one could plausibly argue, is impossible. This is the central problem of material constitution, and it has generated a considerable literature (see Rea, 1997).

Although the debate over material constitution is certainly a puzzling one, it is quite distinct from the debate over material composition. However, the two are related in certain ways, and there are times at which adopting a view on one debate might well have an impact on one’s view concerning the other. The differences, and similarities, between these two distinct debates, and how they interrelate, will become clearer as the entry progresses.

2. The Special Composition Question

Questions concerning material composition have a long history in philosophy, but they have attracted increased attention over recent years thanks largely to the work of Peter van Inwagen. In a 1987 article, and at greater length in his 1990 book, Material Beings, van Inwagen posed what he called the Special Composition Question (hereafter, SCQ). (It is only fair to note here that van Inwagen actually credits Hestevold, 1981, with originally formulating the SCQ, but it is van Inwagen who made it well known.) This question can be phrased as follows:

(SCQ): Under what conditions do two or more material objects compose a further, composite object?

In other words: what is required in order for some objects to be parts of another object? Or as van Inwagen has put it, if you had two objects, what would you need to do to them in order to get them to compose something?

It is perhaps worth noting here that van Inwagen called this the ‘Special’ Composition Question in order to differentiate it from what he called the ‘General Composition Question’ (GCQ). The GCQ asks the broader question of what the composition relation actually is, in general. Van Inwagen was sceptical about the prospects of answering this question, stating he did not even know how to approach it, let alone answer it. It seems that most philosophers have followed suit, as there is not a great deal of literature on the GCQ. (However, see Hawley, 2006, for an attempt to shed some more light on the matter.)

a. Answering the Special Composition Question

A satisfactory answer to this question should take something like the following form:

(ANSWER): for any xs (where those xs are material objects), there is a further material object, y, composed of those xs if and only if _______________________________________.

The task, therefore, is to fill in the right-hand side of the above biconditional. But as van Inwagen went on to show, this is no easy task. In particular, it seems very difficult to provide a principled and systematic answer to the SCQ that accommodates our common-sense intuitions about when composition does and does not occur. In the end, he concluded that it is impossible to provide such an answer. Instead, his own answer is radically counter-intuitive. Van Inwagen’s answer to the SCQ, which has come to be known as ‘organicism’, is:

(ORGANICISM): for any xs (where those xs are material objects), there is a further material object, y, composed of those xs if and only if the collective activity of those xs constitutes a life.

The reason that this answer is so counter-intuitive is that if it is true, it means that the only composite objects in existence are living beings. Inanimate composite objects, according to this view, do not exist. There are no cars or buildings, tables or chairs, planets or stars and so forth. Van Inwagen recognises just how radical this view is—indeed, he calls it ‘the denial’—but he insists that a thorough analysis of the SCQ leads inexorably and inevitably to it. Van Inwagen’s own answer to the SCQ has not proved to be all that popular. However, the SCQ itself has generated huge amounts of subsequent interest and swathes of further literature.

It is important to note that any proposed answer to the SCQ will fall into one of the three following categories:

  1. Compositional Universalism:

Whenever you have two or more material objects, there is always a further object that they compose.

  2. Compositional Nihilism:

No objects compose, and no objects have parts. That is, there are no composite objects in existence.

  3. Compositional Restrictivism:

Some collections of material objects compose further objects, but others do not.

Each of these approaches to the SCQ comes with its own merits and demerits, and each has been defended (and attacked) in the contemporary literature. Van Inwagen’s own answer falls into the third category, as it says composition occurs sometimes, but only sometimes (specifically, when some xs partake in collective activity which constitutes a life). What follows will survey some of the central arguments that have been given for and against each of these three positions.

3. Compositional Restrictivism

There is one very compelling reason to think that some variety of compositional restrictivism must be true: common sense. On first inspection, it seems simply obvious that composition is restricted, that is, it occurs sometimes, but not all the time. After all, one does not need to engage in much serious reflection to realise that the Eiffel Tower, for instance, is composed of iron girders, and that the Great Pyramid of Giza is composed of limestone blocks. Yet equally obvious is the fact that there is no object which these two great edifices together compose (that is, there is no object which has just the Eiffel Tower and the Great Pyramid of Giza as parts). So, since it is plainly evident that there are some cases in which objects do compose and other cases in which they do not, it also seems plainly evident that composition must be restricted.

The challenge for the restrictivist, however, is to formulate an answer to the SCQ that accommodates these common-sense intuitions. That is to say, she must specify the necessary and sufficient conditions under which composition occurs, such that they are satisfied by the iron girders in Paris (which compose the Eiffel Tower), and the limestone blocks in Giza (which compose a pyramid), but not satisfied by the girders and blocks taken together (so that we do not end up with some rather unusual composite pyramid-tower, or suchlike). The literature that has emerged on this topic shows that providing such an answer is no easy task.

a. Simple Bonding Answers

In Material Beings, Peter van Inwagen tried to formulate an answer to the SCQ that preserves some of our common-sense intuitions about composition. He noted that these intuitions very often seem to be based on certain facts about how objects are grouped or connected together. That is, we often seem to think that objects compose a further object if they are bonded together in some appropriate way. The reason that the iron girders in Paris compose a tower, for instance, is that they are fastened together with many millions of bolts and rivets, and what have you, to form a solid and rigid structure. Moreover, the reason that the Eiffel Tower and the Great Pyramid of Giza do not compose a material object is precisely because they lack any such bonding or unity; they are completely distinct and disconnected objects separated by well over a thousand miles. Perhaps, then, bonding could be the secret to unlocking the SCQ?

Van Inwagen labelled his first attempt at a bonding-style answer to the SCQ, CONTACT. Very simply, it states that objects need to be physically touching one another if they are to compose a further object.

(CONTACT): for any xs (where those xs are material objects), there is a further material object, y, composed of those xs if and only if the xs are in contact with one another.

Although this answer certainly does give us the intuitive result that the collection of iron girders in Paris do compose a tower, and the limestone blocks in Giza do compose a pyramid, it also entails certain conclusions which simply fly in the face of common sense. As van Inwagen notes, if CONTACT were true, it would mean that every time you shook someone’s hand, a new material object would instantaneously pop into existence, only to vanish back into nothingness once the handshake ceased. The sheer absurdity of this consequence seems to suggest that CONTACT cannot possibly be the correct answer to the SCQ, particularly when you remember that what originally motivated it was a desire to preserve common sense.

Van Inwagen went on to consider a number of other bonding-style answers to the SCQ, which he called FASTENING, COHESION, and FUSION. Each of these solutions involves a greater strength of bond than the last, culminating in FUSION, whereby for objects to compose, they must be fused together, which means they must be ‘melt[ed] into each other in a way that leaves no discernible boundary’ (Van Inwagen, 1990, 59).

In light of the above comments, however, it should be fairly straightforward to see that none of these answers are going to work (at least, none of them will satisfy common sense). If we return to the example of two people shaking hands, it seems evident that even if you stick their hands together, even if you fuse them with an unbreakable adhesive, you will never make them compose a single object. You will simply have two objects—two distinct persons—in the rather unfortunate situation of being stuck.

Moreover, all these bonding answers fail to account for the possibility of what are known as scattered composite objects—that is, composite objects whose parts are not in contact with one another. But common sense suggests that there are in fact such scattered objects. A bikini, for instance, seems to be an ordinary composite object, yet it is composed of two distinct, and spatially separated, parts. Or the USA, to give another example, seems to be a composite object, yet it is composed of spatially disconnected parts—the island of Hawaii is separated from the mainland by a considerable distance, as is Alaska. If any variety of bonding-style answer were correct, then it would turn out that there are not, in fact, any bikinis in existence, and even more worryingly, many Hawaiians and Alaskans would lose their country of residence! Bonding-style answers, therefore, have found very few supporters.

b. Series-Style Answers

Van Inwagen then went on to consider the idea that there is perhaps not a single, one-size-fits-all answer to the SCQ, but instead, that different criteria will apply to different types of objects, according to which they will compose or fail to compose. The thought is that the criteria that a bunch of cells need to satisfy in order to compose a human being, for instance, might be very different from the criteria that a bunch of bricks might need to satisfy in order to compose a house. If this is right, then perhaps when answering the SCQ, we need to set out the specific criteria of composition for different types of material object. Such answers have come to be known as series-style answers (SSAs), since they will consist of a long series of different criteria that different types of object must satisfy in order to compose. A SSA to the SCQ will look something like the following:

(SSA): for any xs (where those xs are material objects), there is a further material object, y, composed of those xs if and only if the xs are F1s and stand in relation R1, or the xs are F2s and stand in relation R2, or …, the xs are Fns and stand in relation Rn.

The attraction of this kind of answer is that it looks like it might accommodate certain intuitions we have about composition, such as the fact that by fastening bricks together with cement, you can compose a further object (for example, a house), but by fastening human beings together with cement, you cannot.

Van Inwagen was fairly quick to dismiss the prospects of a satisfactory SSA to the SCQ, however, as he thought that such answers suffered from a number of difficulties. One of the main problems he foresaw was that a SSA to the SCQ would violate the transitivity of parthood, which he took to be an unacceptable consequence. It is easy to see why one might assume that parthood is a transitive relation. For if x is a part of y, and y is a part of z, then it just seems evident that x must also be a part of z. For example, if the bearing is part of the wheel, and the wheel is part of the car, then the bearing must also be part of the car.

Van Inwagen claimed, however, that SSAs to the SCQ would violate this transitivity. For instance, suppose we endorsed a SSA that included the fact that xs composed ys if and only if they were related by R1, and ys composed zs if and only if they were related by R2. In that case, an x could be a part of a y which was itself part of a z, yet x would not be part of z (because, as per the answer, xs cannot compose zs; zs can only be composed by ys related by R2).
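The worry can be made concrete with a small sketch. The Python below is an illustration only, under stated assumptions: the example objects (cells, organs, a body) and the helper names are hypothetical and do not come from van Inwagen's text. It models a toy series-style answer on which parthood holds only between an object and the things that directly compose it, and then checks whether the resulting parthood relation is transitive.

```python
# Toy model of the transitivity worry about series-style answers.
# All object and function names are hypothetical illustrations.

# Direct composition facts licensed by the toy SSA:
# cells (related by R1) compose an organ; organs (related by R2) compose a body.
composed_of = {
    "heart": {"cell_a", "cell_b"},
    "lung": {"cell_c", "cell_d"},
    "body": {"heart", "lung"},
}

def part_of_pairs(composition):
    """Return parthood as a set of (part, whole) pairs,
    counting only the direct composition facts."""
    return {(part, whole) for whole, parts in composition.items() for part in parts}

def transitivity_failures(pairs):
    """Return every (x, y, z) with x part of y and y part of z, but x not part of z."""
    return {
        (x, y, z)
        for (x, y) in pairs
        for (y2, z) in pairs
        if y == y2 and (x, z) not in pairs
    }

parthood = part_of_pairs(composed_of)
failures = transitivity_failures(parthood)
```

On this model each cell is part of an organ and each organ is part of the body, yet no cell counts as part of the body. The sketch builds in the assumption that parthood tracks only direct composition; a defender of series-style answers may reject that assumption.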

Since the publication of Material Beings, surprisingly little attention has been paid to the possibility of SSAs. However, some recent work on the topic suggests that van Inwagen’s dismissal of such answers may have been a little hasty. Silva (2013) has responded to van Inwagen’s objections and shown that SSAs need not be inconsistent with the transitivity of parthood. Carmichael (2015) has gone one step further and formulated a clearly defined SSA to the SCQ, one which he claims satisfies our common-sense intuitions about composition and overcomes van Inwagen’s objections.

c. Sorites Paradoxes and Sharp Cut-Off Points

A significant problem that affects all restrictivist positions (or, at least, virtually all of them—see the following section, 3d, for one exception to this) is that they are susceptible to sorites-style arguments. This style of argument takes its name from the ancient sorites paradox, or the paradox of the heap (Soros, from where the term ‘sorites’ derives, is Greek for ‘heap’). The paradox, which is usually accredited to the Greek philosopher, Eubulides, is simple to set up. First, consider a single grain of rice. It seems quite clear that a single grain of rice is not a heap of rice, and neither is two grains, nor three. But if we had ten thousand grains, we would most certainly have a heap. The paradox arises because it is difficult, if not impossible, to state the precise point at which the heap emerges. The crucial thought that drives the paradox is that a single grain of rice, it is supposed, is simply not significant enough to make the difference between a heap and a non-heap. Adding or removing just one grain of rice could never create or destroy a heap. But if this is right, it seems to follow that if you start off without a heap, then by adding grains one at a time, you will never be able to make a heap, no matter how many grains you add. Conversely, if you start off with a heap, then by removing grains one at a time, you will never get rid of the heap, even if you were to remove all the grains! Thus, the paradox ensues.

The force of the sorites paradox strikes right at the heart of compositional restrictivism. To see why, just consider any ordinary composite object; let us say a chair. That chair will be composed of many billions of atoms, each one of which will be very small indeed. Now when you consider just how small a single atom actually is, it seems quite clear that the difference of a single atom could not possibly make the difference of there being a chair or there not being a chair. To suppose otherwise seems, frankly, preposterous. But, now suppose that, with some ultra-high-precision tweezers, you began the long and laborious task of removing atoms from the chair, one by one. Eventually, you would reach a stage at which you had removed all the atoms except one; at which point, you would clearly no longer have a chair in front of you. (A single atom doth not a chair make.) What seems to follow from all this is that there must be a cut-off point at some stage of the atom-removal process at which the removal of a particular atom makes the composite object—the chair—suddenly cease to exist. To many, however, this is a simply fantastical proposal! To suppose that a single, nugatory atom could make the difference between a chair’s existing and not existing is a very hard conclusion to swallow.
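The structure of this argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the predicate `is_chair` and the threshold it uses are hypothetical stand-ins, chosen arbitrarily, for whatever the restrictivist's criterion turns out to be, and the atom count is kept unrealistically small so the scan runs quickly. The point is that any classical (true-or-false) predicate that holds of the full collection of atoms and fails for a single atom must flip at some particular step.

```python
def find_cutoffs(predicate, start, stop=1):
    """Scan from `start` atoms down towards `stop` atoms, removing one at a time,
    and return every n where predicate(n) holds but predicate(n - 1) does not."""
    return [n for n in range(start, stop, -1) if predicate(n) and not predicate(n - 1)]

# Hypothetical criterion: the threshold is an arbitrary assumption for illustration.
ARBITRARY_THRESHOLD = 600

def is_chair(atom_count):
    return atom_count >= ARBITRARY_THRESHOLD

cutoffs = find_cutoffs(is_chair, start=1000)
```

Whatever threshold is chosen, the scan returns at least one cut-off so long as the predicate is true of the starting collection and false of the single remaining atom. The sorites objection is that no such step looks like a plausible place for a chair to go out of existence.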

These sorts of considerations, concerning sorites-style arguments and sharp cut-off points, have led many to believe that restricted composition, in any of its possible guises, is untenable. Peter Unger (1979; 1980) is perhaps the most notable advocate of using sorites-style arguments against the existence of ordinary objects.

It remains to be said, however, that although these sorites-style arguments certainly have force, they are not without opposition. Both Korman (2015) and Carmichael (2011), for instance, have articulated responses to the arguments and maintain a resolute conviction that composition is restricted.

d. Brutal Composition

Ned Markosian is one of the few philosophers who has persevered with restrictivism. In a 1998 paper, he outlines and argues for a novel view which he calls ‘Brutal Composition’. According to this view, ‘there is no true, non-trivial, and finitely long answer to the SCQ’ (Markosian, 1998, 213).

Instead, Markosian claims that whether composition does or does not occur in any given case is simply a brute fact. That is to say, it is a fact, but it does not obtain in virtue of any other facts, and there can be no illuminating explanation of why it obtains. It is a fact, and that is just the way it is.

On this view, then, the iron girders in Paris do compose the Eiffel Tower, and the limestone blocks in Giza do compose the pyramid. Likewise, it is also true that the Eiffel Tower and the Pyramid, taken together, do not compose any further object. These are just some of the facts about composition that obtain in the world. According to Markosian, however, there is no principled explanation of why these facts obtain. They just do.

An advantage of Markosian’s view, he claims, is that it is capable of accommodating all of our common-sense intuitions about composition (although this could be resisted—see below). Ordinary composite objects really do exist, and exotic, gerrymandered composite objects (like the object composed of the Eiffel Tower and the Great Pyramid) do not.

Likewise, brutal composition has a clear answer to the sorites-style arguments that we encountered above. There are sharp cut-off points between cases of composition and non-composition; a single atom really can make the difference between a chair’s existing and not existing. We do not know exactly where the cut-off points will lie, of course, but they will be there somewhere. And there is no pressure on the brute compositionalist to explain why a cut-off point lies where it does, precisely because compositional facts are brute; they admit of no further explanation. As Markosian notes, the brute compositionalist ‘can just shrug and say, “there is no reason. It is a brute fact”’ (Markosian, 1998, 37).

However, there are a number of reasons one might be suspicious about brutal composition. The main reason is that Markosian’s only real motivation for endorsing the view is that it is meant to be the only theory capable of preserving our common-sense intuitions about composition. The problem is, however, that it is not at all clear that it actually does this.

For instance, as James Van Cleve has pointed out, common sense may well point to the fact that composition is restricted, but it also surely points to the fact that there is a reason why it is restricted (Van Cleve, 2008, 333). Yes, common sense suggests that the Eiffel Tower exists, and that it is composed of iron girders, but it also suggests that there is a reason it exists, namely, that it was purposely built, and that the parts are fixed together in an appropriate way, and so on and so forth. It is not by sheer arbitrary chance that these items compose a tower, or so common sense would have it.

According to brutal composition, there is no reason why some objects compose; that is what it means to say that compositional facts are brute. It therefore follows that the arrangement of the iron girders, and the way they are fixed together, has nothing to do with the fact that they compose the Eiffel Tower. We could dismantle the tower completely, we could fire the girders into the furthest depths of the universe, but according to brutal composition, they would still compose an object. But this could hardly be said to be consistent with our intuitions about composition!

It is for reasons such as this, as well as the more general concern that it seems just too ad hoc, that Brutal Composition has not proved at all popular among those philosophers who have worked on this topic.

e. Concluding Remarks

Because of the problems raised above, the majority of writers on this topic have concluded that compositional restrictivism, in any of its guises, is an untenable position. There are exceptions to this, of course, with van Inwagen, Markosian, and Korman being notable among them, but these exceptions are undoubtedly in the minority. Our initial intuitions may well point to the fact that composition is restricted, but close philosophical analysis reveals that a principled theory that can accommodate such intuitions seems very difficult, if not impossible, to come by.

But if this majority are correct, and material composition is not restricted, then it means that we are left only with what van Inwagen called the ‘extreme answers’ to the SCQ (van Inwagen, 1990, 72). That is, one must say that composition always occurs (that is, endorse compositional universalism) or say that composition never occurs (that is, endorse compositional nihilism). Of these two options, it is the former that has proved the more popular among contemporary philosophers; indeed, it would probably be fair to say that universalism is the default view. (Although this may be beginning to change: in very recent years, nihilism has begun to grow in popularity.)

One of the main advantages that both universalism and nihilism wield over restrictivism, and one of the main reasons they are the most popular answers to the SCQ, is that they are completely unaffected by the sorites-style arguments articulated above, in section 3c. For neither answer has to state where the cut-off points will lie between cases of composition and cases of non-composition, because neither answer admits that there are such points. According to universalism, there are no cases of non-composition, and according to nihilism, there are no cases of composition, thus neither theory admits the existence of cut-off points.

The following two sections will give an overview of both universalism and nihilism, and the main arguments that have been given for and against them.

4. Compositional Universalism

Compositional Universalism (CU) can be defined as follows:

(CU): for any xs whatsoever (where those xs are material objects), there is a further material object, y, which those xs compose.

For the reader unfamiliar with this debate, it may come as something of a surprise to learn that the view of the informed majority is that compositional universalism is true. The reason for this is that the truth of universalism implies the existence of a vast number of weird and wonderful composite objects. After all, if universalism is true, then for any collection of material objects whatsoever, there will be a further object that they compose. Thus, there will be a material object composed of your favourite shirt, Donald Trump’s hair, and the top half of the planet Mars. And it would turn out that there is, after all, an object composed of the Eiffel Tower and the Great Pyramid of Giza. Universalism is entirely indiscriminate. It matters not how disparate or incontiguous two objects may be; according to universalism, they will compose something. Despite this rather unusual fact, however, universalism remains a popular view.

a. Arguments for Universalism

i. The Argument from Elimination

There is an argument for universalism which seems to hold considerable sway with a number of philosophers, even though it is rarely explicitly stated. It is an argument from elimination, and it consists of two claims. The first claim is that composition is not restricted (based on the type of consideration covered in the previous section), and the second claim is that composition clearly occurs in some cases (for example, I exist, and I am composed of parts). The conjunction of these claims is taken to entail the truth of universalism:

  1. Composition is not restricted.
  2. Therefore, composition must either always occur or never occur.
  3. Composition definitely occurs in some cases.
  4. Therefore, composition must always occur.
  5. Therefore, compositional universalism is true.

David Lewis has endorsed precisely this type of argument. He says: ‘no restrictions on composition can serve the intuitions that motivate it. So restriction would be gratuitous. Composition is unrestricted’ (Lewis, 1986, 213). Ted Sider has also advanced an argument similar to this (see Sider, 2001, 120–132. It is interesting to note, however, that Sider has now changed his view and endorses compositional nihilism).

The argument appears clearly valid, but in premise 3, it includes a significant assumption. Many, like Lewis, think that premise 3 is obviously true. Indeed, you will note in the above quote from Lewis that he does not even state anything like premise 3. He jumps straight from the claim that composition is not restricted to the conclusion that it must be unrestricted. The truth of premise 3 must have been so obvious to Lewis as to be not worth mentioning.
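The validity claim can be checked mechanically. The Python sketch below is an illustration under stated assumptions: it treats the three categories of answer (universalism, nihilism, restrictivism) as propositional variables `u`, `n`, and `r`, exactly one of which is true, and it encodes premise 3 as the denial of nihilism. The encoding and all names are hypothetical conveniences, not part of the argument as its proponents state it.

```python
from itertools import product

def exactly_one(*values):
    return sum(values) == 1

def argument_is_valid(premises, conclusion):
    """Return True if no admissible assignment makes every premise true
    and the conclusion false."""
    for u, n, r in product([True, False], repeat=3):
        if not exactly_one(u, n, r):
            continue  # assumed: every answer falls into exactly one category
        if all(p(u, n, r) for p in premises) and not conclusion(u, n, r):
            return False
    return True

not_restricted = lambda u, n, r: not r      # premise 1
some_composition = lambda u, n, r: not n    # premise 3: composition occurs somewhere
universalism = lambda u, n, r: u            # conclusion

with_premise_3 = argument_is_valid([not_restricted, some_composition], universalism)
without_premise_3 = argument_is_valid([not_restricted], universalism)
```

With premise 3 included the check succeeds; without it the check fails, since nihilism remains a live option. On this encoding, then, the whole weight of the argument rests on premise 3.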

However, for many philosophers, premise 3 is not obviously true, and cannot simply be assumed. One reason to think this is that once we reject compositional restrictivism, then we seem to reject most (if not all) of our common-sense intuitions about composition along with it. As such, it looks questionable to make any assumptions about whether composition does or does not occur in any given case. If these assumptions are given up, then the above argument loses its force, and collapses into a mere restatement of the fact that composition is not restricted.

ii. The Argument from CAI

It has been suggested that composite objects are identical to their parts taken together. That is to say, if a composite object, o, is composed of some parts, the xs, then o is not an additional object to the xs; it just is the xs, taken collectively. This thesis has come to be known as Composition as Identity (CAI), and has its most notable proponent in Donald Baxter, who has provided some compelling examples in its support. For instance:

Someone with a six-pack of orange juice may reflect on how many items he has when entering a ‘six items or less’ line in a grocery store. He may think he has one item, or six, but he would be astonished if the cashier said ‘Go to the next line please, you have seven items’. We do not ordinarily think of a six-pack as seven items, six parts plus one whole. (Baxter, 1988, 579)

The thought is, therefore, that composite objects are identical, in the strict sense of numerical identity, to the parts that compose them. The six-pack literally is the six bottles taken together—nothing more, and nothing less. (See Wallace, 2011, for a nice introduction to the topic of CAI, and Baxter and Cotnoir (eds.) 2014 for more in-depth discussion.)

Moreover, it has also been suggested, by Trenton Merricks, that CAI entails universalism. That is, if CAI is true, then universalism must also be true. CAI, therefore, offers another potential line of argument in favour of universalism (albeit, a line of argument that is dependent on the truth of CAI).

The thrust of the argument is that if composition is identity, then the fusion of any objects just is those objects taken together. So for any objects whatsoever, you automatically get their fusion, because their fusion just is those objects. Merricks puts forward just such a proposal, making the seemingly plausible claim that ‘it seems nonsensical to deny the existence of something that would, if it existed, be (identical with) things whose existence one already affirms’ (Merricks, 2005, 629. It should be noted that Merricks does not endorse universalism, however. Although he does claim that CAI entails universalism, he does not believe that CAI is true). But that is precisely what someone would be doing if they endorsed CAI but did not endorse universalism, or so the argument goes. Therefore, we are led to conclude that if CAI is true, universalism must be true also. To illustrate, consider once more our six-pack of orange juice. First, suppose that you accept, without reservation, the existence of the six individual bottles of juice. Now according to CAI, the six-pack (the whole) just is the six bottles taken together, nothing more, and nothing less. So given the fact that you accept the existence of the six bottles, you already accept the existence of the six-pack. And the same goes for any collection of objects you can think of. It is as simple as that: CAI entails universalism.

There are two potential problems with the argument from CAI. First, as Ross Cameron has argued, there are reasons to think that the argument is not valid. Cameron’s central point is that CAI is a thesis about the nature of composition (that is, it tells us what composition is—identity), but it does not tell us when composition does and does not occur. For CAI tells us that when there is a composite object, that object is identical to its parts taken together. Furthermore, it tells us that when some objects are, taken together, identical to some single object, then they compose that object. Crucially, however, it does not tell us when some objects are identical to a single object and when they are not. As Cameron says: ‘[CAI] does not tell us whether, given some xs, they in fact compose; it only settles the biconditional: they compose iff there is some one to which they are identical’ (Cameron, 2012, 534). In order for CAI to entail universalism, one must already assume that given any xs whatsoever, there is a single object to which those xs are identical—in other words, that there is a single object which those xs compose. But that is just to beg the question in favour of universalism.

The second problem with the argument is that CAI itself is a highly controversial thesis. Indeed, for many, CAI is not just controversial, but incoherent. The main problem with it is that it seems to twist and contort the standard understanding of the relation of identity to unacceptable extremes. For instance, it appears that CAI violates Leibniz’s law, which states that if x = y, then anything true of x must also be true of y, and vice versa. If CAI is true, then it seems that this principle no longer holds. To see why, consider again our six-pack of juice. CAI says the six-pack is identical to the six bottles of juice. But the six-pack is a single object, whereas the six bottles are six objects. Therefore, it looks like something is true of the six bottles (that is, they are six) which is not true of the six-pack (that is, it is one), which is a violation of Leibniz’s law.
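Stated schematically, the principle and the apparent counterexample look as follows. The notation is an assumption of this presentation rather than anything drawn from the literature cited: $F$ ranges over properties, $s$ names the six-pack, $bb$ names the six bottles taken together, and $\text{IsOne}$ is a hypothetical predicate for being a single object.

```latex
\forall x\,\forall y\,\bigl(x = y \rightarrow \forall F\,(Fx \leftrightarrow Fy)\bigr)

\text{CAI: } s = bb, \qquad \text{IsOne}(s), \qquad \neg\,\text{IsOne}(bb)
```

From $s = bb$ and Leibniz's law, $\text{IsOne}(s)$ would imply $\text{IsOne}(bb)$, which contradicts the third claim. A defender of CAI must therefore reject one of the three claims or qualify the law itself.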

b. Arguments against Universalism

i. The Gratuitousness of Universalism

An often-noted drawback of universalism is that it posits the existence of too many objects. The ontology of the universalist is vast. The reason for this is that universalism states that for any collection of objects whatsoever, there will always be an object which those objects compose. It should be quite clear to see, therefore, that universalism implies the existence of a simply astronomical number of objects. For some, this objection is enough to reject universalism out of hand. Markosian, for instance, claims, ‘there is what seems to me to be a fatal objection to universalism: universalism entails that there are far more composite objects than common sense intuitions allow. […] On the basis of this objection, I reject universalism’ (Markosian, 1998, 22–23).
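To give a sense of scale, the following Python sketch counts the composite objects that universalism yields from a finite stock of starting objects. It assumes, for illustration only, that the starting objects do not overlap and that distinct collections always compose distinct objects; neither assumption is stated in the text, and the function names are hypothetical.

```python
from itertools import combinations

def count_composites(objects):
    """Count collections of two or more of the given objects.
    Under the stated assumptions, each such collection composes one object."""
    objects = list(objects)
    return sum(
        1
        for size in range(2, len(objects) + 1)
        for _ in combinations(objects, size)
    )

def count_composites_closed_form(n):
    """All subsets (2**n) minus the empty set (1) and the n singletons."""
    return 2 ** n - n - 1

three = count_composites(["a", "b", "c"])
ten = count_composites(range(10))
```

Three objects already yield four composites, ten yield 1,013, and the count roughly doubles with each additional object.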

There are two main strategies universalists employ to overcome this objection. The first, endorsed by the likes of David Lewis and David Armstrong, is to say that, although universalism does posit a vast number of composite objects, this should not count against the theory because these composite objects are taken to be ontologically innocent.

The idea here is that composite objects do not contain any extra matter, over and above their constituent parts, and therefore they somehow come for free, ontologically speaking. Armstrong, for instance, tells us that ‘mereological wholes are not ontologically additional to their parts’ (Armstrong, 1997, 12), whilst Achille Varzi states, ‘the whole and the parts encompass the same amount of reality and should not, therefore, be listed separately in an inventory of the world’ (Varzi, 2000, 285). David Lewis, too, echoes these sentiments by saying ‘it would be double counting to list the cats and then list their fusion’ (Lewis, 1991, 81).

The main problem with this strategy is that the notion of ontological innocence is somewhat mysterious; that is, it is not clear what it is meant to consist in. If a table, for instance, is taken to be ontologically innocent, yet one of the atoms that composes it is not, then are these two entities supposed to exist in the very same sense? If so, then it is not clear why only one of them should ‘count’, ontologically speaking. But if not, then one might think that we need a clearer explanation of what this existential difference actually consists in. This suggests that the notion of ontological innocence is perhaps not as informative as it needs to be.

It is perhaps worth mentioning, however, that this objection to ontological innocence loses its force if the proponent of ontological innocence also endorses CAI. After all, if you already accept the existence of some parts, then accepting their fusion does not seem like an extra ontological commitment if it is identical to those very parts. Without the addition of CAI, however, the problem persists.

The second strategy that has been proposed by universalists, in response to the charge of ontological gratuity, is to simply bite the bullet. That is, admit that universalism is not very parsimonious with respect to the number of composite objects it posits, but then deny that parsimony in that respect is particularly important. This line has been taken by Lewis, who makes a distinction between quantitative and qualitative parsimony. Qualitative parsimony is concerned only with the number of types of entity that a theory posits, whereas quantitative parsimony concerns the number of tokens of those types. Lewis has argued that only qualitative parsimony is an important theoretical virtue; once you have admitted a particular type of entity into your ontology (for example, composite objects), then it does not matter how many tokens of that type your ontology contains. Given that most of us already accept the existence of the type—material composite object—then it does not matter that universalism posits a lot of them; this should not count against the theory.

There are two potential sticking points for this response. The first is that some thinkers, such as Daniel Nolan (1997), have argued that quantitative parsimony is in fact a theoretical virtue. If these thinkers are right, then Lewis’s response looks clearly flawed. The second thing to note is that compositional nihilists are likely to remind the universalist that they do not countenance material composite objects at all. Therefore, even if we ignore quantitative parsimony, nihilism has the advantage of being qualitatively more parsimonious than universalism, since it posits one fewer type of thing.

ii. The Counter-Intuitiveness of Universalism

A different objection that is sometimes levelled at universalism is that it flies in the face of common sense. The vast majority of the composite objects that universalism posits are just not the sort of object that common sense would countenance. Think of any collection of objects you like—no matter how random, how disparate, and how disconnected they may be, there will, according to universalism, be a further object they compose. As Lewis (1991, 7–8) reminds us, universalism admits the existence of trout-turkeys: entities composed of the undetached front half of a trout, and the undetached rear half of a turkey. Some may think, therefore, that these sorts of objects simply make universalism too counter-intuitive to be true.

Lewis, however, has a solution ready at hand. He claims that in ordinary thought and talk we restrict the domain of our quantifiers such that they range only over the ordinary objects of common sense, and not over extraordinary, gerrymandered objects such as trout-turkeys. It is only because of this that universalism seems so counter-intuitive.

In Lewis’s defence, we do often use quantifiers in a restricted sense in ordinary communication. For instance, if a mugger stole your wallet, you may tell the police that he stole all your money. But you would not literally mean all your money. (Presumably, the mugger did not empty your bank account and gather all the loose change from the back of your sofa.) What you would have meant, of course, is that the mugger stole all the money you had with you at the time. Thus, you would have been tacitly restricting the domain of your quantifiers such that they ranged only over the contents of your wallet, or perhaps over whatever you had on your person. Once this is recognised, it becomes clear that we actually employ restricted quantification all the time. (Note: that does not actually mean all the time.)
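The mugger example can be modelled as a universal quantifier evaluated over two different domains. The following is only an illustrative sketch: the `Money` record, the location labels, and the sample holdings are hypothetical and introduced here for the purpose of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    description: str
    location: str  # hypothetical labels: "wallet", "bank", "sofa"
    stolen: bool


# Hypothetical holdings of the victim at the time of the mugging.
holdings = [
    Money("cash", "wallet", True),
    Money("savings", "bank", False),
    Money("loose change", "sofa", False),
]


def all_stolen(domain):
    """Universal quantification: every item in the domain was stolen."""
    return all(item.stolen for item in domain)


# Unrestricted domain: literally all the money you own.
unrestricted = holdings
# Tacitly restricted domain: only the money you had with you.
restricted = [m for m in holdings if m.location == "wallet"]

print(all_stolen(unrestricted))  # False
print(all_stolen(restricted))    # True
```

The sentence 'he stole all my money' comes out false over the unrestricted domain and true over the restricted one, which is the sense in which the claim is true as ordinarily intended.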

Lewis suggests this is what happens when we talk about composite objects. We tacitly restrict our domain of quantification such that it includes only those composite objects recognised by common sense, and does not include exotic composites like trout-turkeys:

Restrict quantifiers not composition. […] We have no name for the mereological sum of the right half of my left shoe plus the moon plus the sum of all her Majesty’s ear-rings, except for the long and clumsy name I just gave it; we have no predicates under which such entities fall, except for technical terms like ‘physical object’ (in a special sense known to philosophers) or blanket terms like ‘entity’ and maybe ‘thing’; we seldom admit it to our domains of restricted quantification. It is very sensible to ignore such a thing in our ordinary thought and language. But ignoring it won’t make it go away. (Lewis, 1986, 213)

The restricted quantification strategy is quite popular among universalists, but it is not without its problems. A central problem with it is that it looks prima facie implausible (see Korman, 2007). Returning to our example of the mugger, imagine that a particularly meticulous police officer responded to your claim with an arched eyebrow and asked, ‘you really mean he stole all your money; every last penny you owned?’. You may well be exasperated by such a response, but you would probably understand what the officer meant. You would simply have to re-iterate more precisely that you meant the mugger stole all the money that was in your wallet.

But now suppose, on telling the officer that there were precisely two items in the wallet—two twenty-pound notes, say—he were to respond, ‘only two items, you say? But what about the object that those two notes compose? And what about the object composed of the left half of one note and the right half of the other?’. Such a question would not exasperate, but completely befuddle! It seems highly implausible that one might casually respond, ‘Oh, sorry, I didn’t realise you were counting those types of object too’.

What these observations suggest is that although we certainly do restrict our quantifiers in certain circumstances, it usually takes only minimal reflection (or perhaps a prompt from someone, like a fussy police officer) for us to realise, and to accept, that we are doing so. There is no controversy there; it is just something that we do. In contrast, it appears much more controversial to suggest that we regularly restrict our quantifiers to exclude exotic composite objects. For if you tried to point out to someone that they were doing that, it is unlikely that they would even understand what you were talking about, let alone accept that what you said was true. Moreover, even once you had explained what you meant, it still seems unlikely that they would accept what you had said. Much more likely is that they would simply insist that the exotic composites you were attempting to refer to did not exist. Seen in this light, some, like Korman, claim that it stretches the limits of credibility to suggest that, in ordinary thought and talk, we restrict our quantifiers so as to exclude exotic composites.

iii. The Argument from Primitive Cardinality

Juan Comesaña (2008) has presented an argument against universalism based on the grounds that it places unacceptable restrictions on the number of material objects that a world could contain. More technically, it conflicts with a principle that he calls primitive cardinality (PC).

(PC): For any n, there could have been exactly n material things.

PC simply states that there is a possible world containing just one material thing, a possible world containing just two material things, a possible world containing just three material things, and so on and so forth, for every positive integer. Comesaña makes the plausible claim that PC seems obviously true. After all, why could there not be a possible world with just seven material objects in it, for instance, or any other whole number? There seems no good reason to think that this could not be the case.

However, according to universalism, PC is false. For instance, it is impossible, if universalism is true, to have a world in which there are just two material objects. For according to universalism, if you have two objects, you always get a further object that they compose. Thus, it is impossible to have a two-object world, because there will automatically be a third object at such a world: the mereological fusion of those two objects.

Furthermore, universalism does not only rule out the possibility of two-thing worlds, but it also rules out the possibility of four-thing worlds, five-thing worlds, six-thing worlds, eight-thing worlds, and countless more. The reason for this is that with the addition of each individual simple, there will also be the automatic addition of numerous fusions composed of the previously existing simples and the newly added simple. More precisely, for any world with a particular number of simples, n, the total number of material things (that is, simples and fusions) at that world will be 2^n - 1, since every non-empty collection of the simples has exactly one fusion. The only permitted totals are therefore 1, 3, 7, 15, and so on. Therefore, universalism is incompatible with PC.
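The counting behind this claim can be checked directly. The sketch below assumes, for illustration, a world with finitely many simples in which each non-empty collection of simples has exactly one fusion (a simple counting as the fusion of itself); the function names are hypothetical. On that assumption a world with n simples contains 2^n - 1 material things.

```python
from itertools import combinations


def material_things(simples):
    """Return every fusion of the given simples, each as a frozenset of its simple parts.

    Modelling assumption: one fusion per non-empty subset of the simples.
    """
    items = list(simples)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


def possible_world_sizes(max_simples):
    """Total number of material things in worlds with 1..max_simples simples."""
    return [len(material_things(range(n))) for n in range(1, max_simples + 1)]


sizes = possible_world_sizes(4)
print(sizes)  # [1, 3, 7, 15]

# Cardinalities up to eight that no world can have on this model.
excluded = [k for k in range(1, 9) if k not in sizes]
print(excluded)  # [2, 4, 5, 6, 8]
```

The excluded values match the two-, four-, five-, six-, and eight-thing worlds that the argument says universalism rules out.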

How seriously one takes this argument will depend on the strength of one’s conviction in the truth of PC. Comesaña claims that intuition supports the truth of PC. He claims that we have ‘particular pre-theoretical judgments that there could have been exactly two things, and exactly three things, and…’, whereas universalism is supported only by abstract and theoretical principles. Moreover, he claims that it is ‘standard methodological procedure’ in many areas of philosophy to give precedence to pre-theoretical judgements over general theoretical principles, when they conflict. Because of this, he claims that this constitutes prima facie evidence in favour of PC (Comesaña, 2008).

The argument from PC is unlikely to be considered fatal to universalism. After all, the universalist can just bite the bullet and admit that it is simply a consequence of the theory that PC is rendered false. This may well violate an intuition we have, but it is not clear how strong an intuition that is in the first place. Moreover, if compositional restrictivism is false, we have already had to concede that many of our intuitions about material objects are false, so one further concession may not be that hard to take.

Finally, the universalist can remind us that although her theory renders PC, as stated above, false, it is perfectly compatible with a similar principle, which one could call the primitive cardinality of simples (PCS).

(PCS): For any n, there could have been exactly n simples.

Universalism is perfectly compatible with PCS, and, indeed, it may well be PCS, not PC, that our pre-theoretical judgements are driving at.

iv. The Identity Argument

One final argument against universalism suggests that the universalist owes us answers to some particularly tricky questions concerning the identity of composite objects. The argument was originally proposed by van Inwagen (1990, 75), but the version presented below is a modified, somewhat more neutral version of his.

The argument rests on the fact that according to universalism, any collection of objects composes a further object, regardless of any facts concerning those objects’ nature, their locations, or the spatial or causal relations that hold between them. Indeed, according to universalism, it is enough that two objects merely exist, that they compose a further object. No other conditions need be satisfied.

Given this fact, the argument can be set up as follows. Consider an ordinary composite object, let us say, a tree, and let us call this tree, ‘Spruce’. According to universalism, Spruce is a composite object and is composed of a large number of simples (sub-atomic particles, or what-have-you) that are arranged in a tree-like fashion. If we call the fusion of those simples, ‘F’, we can say that Spruce = F.

Now suppose that a bolt of lightning were to strike Spruce and vaporise it. The force of the bolt was such that Spruce was completely destroyed, and all her constituent simple parts were scattered far and wide throughout the surrounding area.

In this eventuality, it would seem quite clear that Spruce no longer exists. If you were to look at the exact spot of the incident, there would be no tree present. However, F does still exist. The simples that composed Spruce have not been destroyed but merely rearranged, scattered far and wide. But according to universalism, their spatial location does not affect their compositional status—they still compose the very same fusion they composed before. Thus, we now have a situation in which F exists, but Spruce does not. But this contradicts our earlier claim that Spruce = F. For if x = y, it is impossible for x to exist although y does not.
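The structure of the argument can be set out schematically. The notation is introduced here purely for illustration and is not part of the original presentation: E_t(x) abbreviates 'x exists at time t', and t_2 is a time after the lightning strike.

```latex
\begin{align*}
&(1)\quad \text{Spruce} = F
  && \text{(premise: the tree is identical to the fusion of its simples)}\\
&(2)\quad E_{t_2}(F)
  && \text{(the simples still exist, so their fusion does)}\\
&(3)\quad \neg E_{t_2}(\text{Spruce})
  && \text{(the tree has been vaporised)}\\
&(4)\quad \forall x\,\forall y\,\bigl(x = y \rightarrow (E_{t_2}(x) \leftrightarrow E_{t_2}(y))\bigr)
  && \text{(identicals cannot differ in existence)}\\
&(5)\quad E_{t_2}(\text{Spruce})
  && \text{(from 1, 2, 4)}\\
&(6)\quad \bot
  && \text{(from 3, 5)}
\end{align*}
```

Since (2) follows from universalism and (3) and (4) are hard to deny, the premise that gives way is (1).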

The upshot of this argument is that although universalism does posit lots of mereological fusions (like F), these fusions are clearly not the ordinary objects of common sense (like Spruce). This is because these fusions are virtually indestructible—you can scatter their parts to the furthest corners of the known universe, and they will still exist. But the same cannot be said of trees, like Spruce, or indeed any ordinary objects of common sense. In light of all this, it looks like the universalist needs to answer two particularly difficult questions:

  1. What are ordinary objects, if not mereological fusions of simples?
  2. Why should we accept the existence of all these peculiar mereological fusions, if they do not include, after all, the ordinary objects of common sense we thought they did?

There are a couple of ways in which the universalist could respond to this argument. The first is to endorse a relation of constitution, and the second is to endorse four-dimensionalism.

On the first option, the universalist would deny the premise in the argument that states Spruce = F. Instead, ordinary objects are not taken to be identical to mereological fusions, but constituted by mereological fusions. (Recall the discussion of material constitution in section 1b). According to this view, one can say that F constitutes Spruce whilst its parts are arranged in a tree-like fashion, but when the parts are spread far and wide, after the lightning bolt, F no longer constitutes Spruce.

Although this view certainly overcomes the argument, it leaves many questions unanswered. First and foremost, what is this relation of constitution meant to be? Moreover, if it is the case that F constitutes Spruce at some points of its existence but not others, it implies that constitution is restricted (in the same sense that composition was taken to be restricted in section 3). But this seems to leave the view open to the sorites-style arguments we encountered earlier, that is, where will the cut-off points lie between cases of constitution and cases of non-constitution? It also seems to invite a question similar to the SCQ, that we could call the Special Constitution Question; that is, under what conditions does some object, o, constitute an F? This question may well prove to be just as difficult to answer as the original SCQ.

The second option for the universalist would be to endorse four-dimensionalism: the view which states that material objects are extended through time, in much the same way they are extended through space. Hence, material objects are four-dimensional (extended in the three dimensions of space, and the fourth dimension of time). As such, material objects have not only spatial parts, but also temporal parts.

A consequence of this view is that objects are not wholly present at any particular moment of time. Rather, they merely have a temporal part that is wholly present. To illustrate, consider an analogy. The river Thames is not wholly present at London Bridge. Rather, only a part of the river exists there. The entire river stretches all the way from the Cotswolds to the North Sea. In the same way, four-dimensionalists would say that the river Thames is not wholly present at any given time. Rather, only a (temporal) part of it is. The whole river stretches (temporally) all the way from that moment of time at which it came into existence, to that moment of time at which it will cease to be.

Interestingly, just like the constitution theorist, the four-dimensionalist will deny the premise which claims Spruce = F, but for very different reasons. Instead, Spruce and F are taken to be distinct, four-dimensional objects that merely share some temporal parts (in the way that two distinct streets could share some spatial parts, at the region at which they cross one another). Specifically, they share the temporal part at which all the parts of F are arranged in a tree-like manner. So according to this view, there are not two distinct objects located in the same place at the same time. Rather, at any time at which the parts of F are so arranged, there is a single object present, which is a (temporal) part of two distinct objects, Spruce and F.

Each of these two responses does enough to overcome the identity argument, but they both represent a cost to the universalist. Thus, although it is not insurmountable, the identity argument seems to show that accepting universalism (which is already a controversial metaphysical thesis) forces one into accepting at least one other controversial metaphysical thesis: either the constitution view or four-dimensionalism. This is unlikely to be considered a fatal cost, but it is a cost that must be recognised nonetheless.

5. Compositional Nihilism

The remaining answer to the SCQ is compositional nihilism (CN):

(CN): for any xs (where those xs are material objects), there is never a further material object which those xs compose.

More simply put, according to nihilism, there are no material composite objects at all; all material objects in existence are mereologically simple.

Nihilism, on the face of it at least, is even more radical than universalism. Think of any object at all that you consider to be composite, that is, to have parts. According to the nihilist, it does not exist. For the nihilist, there are no tables, there are no buildings, there are no planets or stars. There are not even any human beings. (That is, so long as you take such entities to be composite). For this reason, nihilism is often dismissed as obviously false. Any theory which entails the view that there are no human beings is obviously false, or so one might well be tempted to think. However, nihilism has recently been growing in popularity and has been defended in print by a number of philosophers (for instance, Cameron, 2010; Sider, 2013; Cornell, 2017). These philosophers tend to claim that these supposedly absurd consequences of the view (for example, that there are no human beings) are not, in fact, as absurd as they may seem. Once the view is properly understood, they maintain, these apparent absurdities can easily be explained away.

a. Arguments for Nihilism

i. The Causal Overdetermination Argument

One type of argument that has proved to be quite influential in the debate over material composition is that which suggests we should reject the existence of composite objects because, if there were any such things, they would be causally redundant.

Causal redundancy arguments of this ilk are probably more familiar within the philosophy of mind, as they have often been employed in support of physicalism. The idea is that we can give a full causal explanation of human action in terms of the physical states and processes that occur in the brain. As such, there is no need to posit any non-physical, mental entities, as such things would have no causal role to play; they would be causally redundant. (See Kim, 1993.) A similar type of argument can be formulated in support of nihilism. That is, we can give a full causal explanation of any physical event solely by appealing to the microphysical particles involved, their properties, and the relations in which they stand. Thus, there is no need to posit any macroscopic, composite objects, because such things would have no causal role to play; they would be causally redundant.

Trenton Merricks has provided the clearest, and most forceful, version of this argument (Merricks, 2001). (It should be made clear, however, that Merricks only uses the argument to support a quasi-nihilistic view rather than a full-blown compositional nihilism. He does, for instance, allow that human beings exist and are material composite objects. However, he rejects the existence of all inanimate composite objects.) Central to Merricks’s argument is the notion of causal overdetermination. Causal overdetermination occurs when there are multiple, individually sufficient, causes for an event. That is, when an event has more than one cause, each of which would have been fully sufficient, on its own, to bring that event about. It is widely agreed that causal overdetermination is objectionable, and that we should avoid endorsing any theories which involve it (see, for instance, Bunzl, 1979; Loeb, 1974; Kim, 1993). Merricks seizes on this claim and uses it to argue against inanimate material composite objects.

To see how the argument works, consider Merricks’s example of a baseball smashing a window. The thought is that the activity of the atoms which are taken to compose the baseball is quite enough on its own to give a complete causal explanation of the shattering of the window. Therefore, if there exists a baseball in addition to the atoms, then that baseball cannot play any causal role in the shattering of the window; if it did, the shattering of the window would be causally overdetermined. We must therefore conclude that baseballs (and, by extension, all other material composite objects), if they were to exist, would have no causal powers at all.

Merricks completes the argument by making the seemingly plausible claim that material composite objects, like baseballs, surely would have causal powers if they existed. A baseball, if it existed, would be a physical object, with physical properties such as mass and so on and so forth. Thus, it would be implausible to suggest that such a thing would be causally inert; indeed, such a suggestion may well contravene basic laws of physics. As such, he argues, we have no option but to conclude that material composite objects, like baseballs, do not in fact exist.

There are ways in which the argument can be resisted. The most straightforward way of doing so is to simply allow that physical events (like the shattering of windows by baseballs) are in fact causally overdetermined, that is, that they are caused by composite objects and by the constituent parts of those objects. Allowing this would certainly undermine Merricks’s argument, but at the same time, it would also entail that there is widespread and systematic causal overdetermination in the world. For most, however, this conclusion is simply too unpalatable to accept.

A more sophisticated response has been offered by Amie Thomasson, who suggests that the argument is flawed because it is based on the incorrect assumption that composite objects are separate and independent entities from the simple parts of which they are composed (Thomasson, 2006). Thomasson accepts that causal overdetermination is highly objectionable, but only in those cases in which the two overdetermining causes are completely separate and independent from one another. (To take a well-used example, consider a person executed by firing squad, who is hit by two bullets at exactly the same time, each one of which was fully sufficient to kill them.) Thomasson claims, however, that composite objects are clearly not separate and independent from their constituent parts, thus the worry about causal overdetermination is misplaced.

Thomasson certainly has a point that there is a particularly intimate connection between a composite object and its parts. They are not separate and independent in the same way that the two bullets in the firing squad example are. It would be impossible to throw the baseball, for instance, without also throwing its constituent parts. However, provided that one does not endorse CAI, the baseball must be considered a distinct object from the parts that make it up. As a result, it is not obvious just how concerned we should be about the claim that both the baseball and its constituent parts have causal powers.

ii. The Problem-Solving Argument

Another point in favour of compositional nihilism is that it provides a straightforward solution to a number of long-standing problems generated by ordinary material objects. For instance, in section 1b, we considered the problem that arises when one considers a statue and the lump of bronze (or clay, or whatever) that it is made of. The puzzle emerges because it looks like we need to say that the statue and the lump of bronze are distinct objects, as they have different properties (for example, the lump existed before the statue did, and it would survive being squashed into a ball, whereas the statue would not). But this leads to the seemingly bizarre conclusion that we have two distinct objects (a statue and a lump of bronze) occupying exactly the same space at exactly the same time. Other recalcitrant problems which are similar include the Ship of Theseus, the case of Tibbles the cat (see Wiggins, 1968), and the problem of the many (see Unger, 1980).

Various potential solutions have been offered to these problems, but most of them involve the acceptance of some controversial metaphysical thesis or other, such as four-dimensionalism, or the constitution view. The compositional nihilist, however, avoids all these problems in their entirety. This is because, according to the nihilist, there are no composite objects at all. There are no statues, and there are no lumps of bronze, thus the question of how these things relate to one another never arises. Likewise, the nihilist does not have to worry about the problems of the Ship of Theseus or of Tibbles the cat, because there are no such things as ships or cats.

Compositional nihilists often point to this fact as providing support to their view: it offers a simple and elegant way of dissolving (or, rather, avoiding) all the problems generated by material constitution. Indeed, the nihilist could well go one step further and say that the only reason these puzzles have arisen at all is that we have been mistakenly assuming that statues, lumps, ships, cats, and so forth exist in the first place. The puzzles are a direct product of a confused and fallacious understanding of the world. Once we understand the true nature of the world (that is, that there is no such thing as material composition), the puzzles never even get off the ground.

The obvious counter-response to this argument, however, is that compositional nihilism is a far more extreme and controversial metaphysical thesis than any of those which are invoked to solve the problems of material constitution. On this view, therefore, the nihilist cuts off their nose to spite their face. Sure, nihilism might avoid these philosophically puzzling problems of material constitution, but it does so at the exorbitant cost of denying the existence of any ordinary material objects whatsoever. This, for some, is just far too high a cost to pay.

iii. The Argument from Ideological Parsimony

Ted Sider has recently put forward an argument for compositional nihilism based on what he calls ‘ideological parsimony’ (Sider, 2013). The argument appeals to a distinction, originally made by Quine, between a theory’s ‘ontology’ (which consists of the objects the theory posits) and its ‘ideology’ (which consists of the primitive, or unexplained, terms or notions the theory employs).

Arguments that appeal to ontological parsimony (that is, arguments which suggest one theory is better than another because it posits fewer objects) are fairly commonplace in philosophy. But Sider claims that similar arguments can be made which appeal to ideological parsimony (that is, arguments which claim one theory is better than another because it employs fewer primitive terms). Sider’s claim is that nihilism is not only ontologically more parsimonious than universalism (it posits fewer objects—only simples, and no composites), but it is also more ideologically parsimonious, because it can completely do away with the notion of parthood and the related mereological terms and concepts that go with it. The general idea is that this makes nihilism an ideologically simpler theory than universalism (or, indeed, than any theory that accepts the existence of any composite objects), and this should count in its favour.

One way to respond to Sider’s argument would be to accept it in spirit, but to question its strength. That is, it is not obvious how much weight one should afford the notion of ideological parsimony in the first place. If one was unconvinced, then it may seem that the advantage offered by nihilism in this regard was marginal at best. But for those who place great value on ideological parsimony, by contrast, the argument might have considerable power. The jury, it seems, is still out on this issue (although see Cowling, 2013, for a defence of the virtues of ideological parsimony). A final point worth considering here is that one may well think that any advantage that nihilism gains in ideological parsimony is going to be outweighed by the various costs it incurs, such as the fact it denies the existence of ordinary composite objects, like tables, chairs, and human beings.

b. Arguments against Nihilism

i. The Common-Sense Argument

By far the most common objection to compositional nihilism in the extant literature is one that appeals to common sense. It is simply obvious that composite objects exist, thus it is simply obvious that nihilism is false, or so the argument goes. This view is shared by many eminent thinkers (such as Markosian, 1998, 221; Schaffer, 2009, 358), and is perhaps best summed up by Michael Rea, who says: ‘it is just obvious that there are tables, chairs, computers and cars. The fact that some philosophical arguments suggest otherwise seems simply an indication that something has gone wrong with those arguments’ (Rea, 1998, 348).

This kind of argument shares many similarities with G. E. Moore’s famous, hand-raising, refutation of idealism. Essentially, the idea is that we can be far more certain of the common-sense fact that tables, chairs, and other composite objects exist, than we can of any of the abstract and theoretical premises employed in arguments for nihilism. Therefore, common sense should win out—we should accept the existence of ordinary composite objects and conclude that nihilism, regardless of any theoretical advantages it may offer, is false.

Those attracted to compositional nihilism have employed a number of different strategies to combat this objection. A theme that is common to many of them is that the common-sense objection is simply misjudged. That is, it misunderstands, or misconstrues, precisely what nihilism actually states. The point, which has been made by a number of contemporary thinkers (for instance, Sider, 2013; Cornell, 2017), is that although nihilism does deny the existence of ordinary objects like tables and chairs, it does not deny the existence of the physical matter that allegedly composes those objects. Once this fact is recognised, nihilism does not, in fact, violate our common-sense intuitions in the objectionable way it is often claimed to.

As an example, consider an ordinary composite object: a house. According to common sense, this house is made up of many parts. At base, these parts will be very small indeed, that is, some kind of sub-atomic particles or whatever our scientific theories tell us are the fundamental constituents of matter. The point is that the nihilist accepts the existence of all these sub-atomic particles. All she denies is that these particles compose some single, composite object: a house. When seen like this, the common-sense objection seems to lose some of its bite. As Cian Dorr has observed:

If all the plates in my kitchen dresser were to cease to exist, but all the molecules in my dresser were to stay arranged exactly as they are, I wouldn’t care very much. My guests would have no new reason to worry about their food getting all over the tablecloth. In fact, they would never know unless I told them—but come to think of it, I would never know either. (Dorr, 2002, 42–43)

In light of all this, there appears to be a question mark over just how much of a conflict there really is between compositional nihilism and common sense. Taken at face value, with its outright denial of all composite objects, nihilism seems about as controversial a theory as one could wish for, but once it is recognised that nihilism still acknowledges the matter that is taken to compose these composite objects, then the power of the common-sense objection seems to wane.

ii. The Argument from Emergence

A further argument against compositional nihilism is based on what has been called the problem of emergence. In its basic form, the argument begins with the claim that nihilism is incompatible with the existence of emergent properties. It then goes on to say that since there are very good reasons to believe that there are emergent properties, there are equally good reasons to think that nihilism must be false. The beginnings of an argument like this can be found in van Inwagen (1990) and Merricks (2001), but perhaps its clearest articulation is in Schaffer (2007).

To appreciate the force of the argument, one first has to understand what emergent properties actually are. An emergent property is a property of an object or system that cannot be explained or accounted for solely by the properties of that object’s parts. That is to say, emergent properties are taken to be things that are somehow over and above a mere combination of the properties and relations of their bearer’s base constituents. Most familiar properties are clearly not emergent in this sense. Take mass, for instance. Mass, like most properties, is reducible; one can explain the mass of an object or system reductively, in terms of the mass of each of its constituent parts. (For example, the mass of a 100kg pile of bricks can be explained or accounted for solely by the fact that each of the one hundred bricks in the pile has a mass of 1kg.)

Emergent properties, by contrast, resist this kind of reduction. If a property, F, of an object or system is emergent, then it cannot be explained or accounted for solely by an appeal to the properties and relations of its constituent parts. To illustrate, consider the water in a swimming pool. It has the property of being wet. But none of the individual H2O molecules that make it up have that property (a single molecule is not wet). Thus, the property of wetness seems to emerge at the macro-level, and it cannot be reduced to a mere aggregation of properties and relations at the micro-level. (Note: this is just an illustration. It is far from clear whether wetness is, in fact, a genuinely emergent property.)

The most common arena in which emergent properties are postulated is the philosophy of mind and consciousness. The thought is that mental states—something like an excruciating pain, or a sharp pang of guilt, for example—are so entirely distinct in character from the electro-chemical, neurological properties that are instantiated by parts of the brain, that they cannot be explicable purely in terms of those properties. (Just like a single molecule of water is not wet, a single cell in the brain does not feel pain/guilt/love/and so forth.) They may well be caused by activity in the brain, but they emerge holistically as being far greater than the sum of their causal beginnings.

Another quite distinct field in which emergence plays a prominent role is quantum mechanics. Very roughly, the thought is that certain composite quantum objects or systems (often referred to as ‘entangled systems’) can exhibit properties that are quite inexplicable in terms of the object’s/system’s sub-atomic constituents alone (see Schaffer, 2007, for more details of emergence in quantum physics).

With this understanding of emergent properties in hand, it is only a short step to see why they cause such a problem for the nihilist. The reason is that emergent properties seem to imply a stratified picture of the world, whereby reality is divided up into levels of mereological complexity. At the base level, you have the mereological simples, and then you have higher levels populated by the composite objects those simples compose. Emergent properties are those which emerge at higher levels than the base level—that is, which are instantiated by composite objects—and which cannot be explained purely by appealing to the objects and properties at the base level. The problem for the nihilist is that they deny this stratification of reality: there is the base level and nothing else. As such, there are no candidate objects in the nihilist’s ontology that could have emergent properties. Quite simply, there is nowhere for emergent properties to emerge.

If this is right, and nihilism is incompatible with emergent properties, then given that there are good reasons to think that emergent properties do exist (and both quantum physics and philosophy of mind suggest that there are such reasons), these reasons also seem to suggest that nihilism is false.

There appear to be three possible ways in which the nihilist could respond to this charge. The first strategy would be to simply reject the possibility of emergent properties. But since this would conflict with popular views in both quantum physics and philosophy of mind, it is not a particularly attractive route to take. The second strategy would be to endorse a fairly radical form of compositional nihilism, known in the literature as existence monism, which claims that there is only a single material object in existence, the world itself, and it is mereologically simple (that is, has no parts). Schaffer (2007) argues that this is the only way for the nihilist to overcome the problem of emergence (although it should be noted that Schaffer himself does not endorse existence monism). The problem with this strategy is that existence monism is considered by many to be even more extreme and implausible than standard nihilism. Schaffer himself, for instance, labels it a ‘crazy view’. However, see Horgan and Potrč (2008), or Cornell (2016), for recent defences of monism.

The final, and most promising, strategy would be to argue that nihilism is in fact compatible with emergent properties. This strategy has been put forward by Caves (2018) and Cornell (2017), who both argue that simples can collectively instantiate emergent properties, even if none of them individually instantiate that property, and that no composite objects are required in order for this to be possible. Therefore, emergent properties (such as mental states) can still emerge at the macro-level, even though there are no composite objects at that level to instantiate them.

iii. The Problem of Atomless Gunk

According to our current scientific theories, physical matter bottoms out at a ‘base level’. For instance, an ordinary object, like a table, is made of molecules; those molecules are made of atoms, and those atoms are made of even smaller parts such as leptons and quarks. However, that, we are told, is as far as we can go. Leptons and quarks themselves have no smaller parts. They are fundamental particles; they are simple; they represent the ‘bottom layer’ of reality.

But what if this view was wrong? What if the particles that we currently think to be fundamental are in fact made of even smaller parts? This is surely a live possibility. After all, before we discovered the existence of sub-atomic particles, it was presumed that atoms themselves were the smallest constituents of reality. (Indeed, the term ‘atom’ was used precisely because it is derived from the Greek for ‘indivisible’.) We were wrong then, so we could surely also be wrong now.

Some have suggested, however, that it is possible that there may not be a ‘base level’ at all. That is, matter could be infinitely divisible. Another way of saying that is that for any bit of physical matter you choose, all of its parts will have further parts. This rather exotic type of physical matter was labelled by David Lewis as ‘atomless gunk’ (Lewis, 1991, 20), although it is more commonly referred to now as plain ‘gunk’.

The possibility of gunk represents a threat to nihilism. The reason for this is that according to nihilism, the only material objects that exist are simples (that is, objects with no parts). But if matter were ‘gunky’, then it would turn out that there were no simples at all (because every part of gunky matter has further parts—there are no simple parts of gunk). Therefore, if matter were gunky, the nihilist would be committed to saying that there were, in fact, no material objects in existence at all.

The most common response to this problem is to deny flat-out that gunk is a real possibility. It may seem as though it is possible, in the sense that we can conceive of such stuff without running into any obvious contradiction, but this appearance is illusory. Gunk is not possible, and matter must bottom out at some point, thus nihilism is preserved. See Williams (2006) for a defence of this approach.

6. Deflationism

So far, it has been suggested that all answers to the SCQ must fall into one of the three categories: restrictivism, universalism, or nihilism. However, there is in fact a fourth way in which one could respond to the question: to dismiss it altogether. This kind of response has been articulated by a number of philosophers, who have dismissed the SCQ for a variety of reasons. These views fall under the more general heading of ‘deflationism’, as they attempt to ‘deflate’ the importance of the debate over material composition.

Some examples of such deflationist views include that of Amie Thomasson, who claims the SCQ to be an unanswerable question (Thomasson, 2006), and that of Jonathan Schaffer, who takes the existence of composite objects to be obvious and trivial (Schaffer, 2009). But the most influential deflationary account is that of Eli Hirsch, and is discussed below.

a. Hirsch and Quantifier Variance

Hirsch argues that the debate over material composition is not a genuinely ontological debate, but rather, merely a verbal dispute (see Hirsch, 2005). What this means is that when a compositional nihilist argues with a compositional universalist about whether there are any tables, for instance, they are merely talking past one another rather than having a genuine disagreement about what things exist. The source of confusion is that they are using the same words to mean different things. In slogan-like fashion: they agree about the facts, they disagree about the semantics.

More specifically, Hirsch has proposed a theory of ‘quantifier variance’ (with roots in the thought of Rudolf Carnap), whereby different speakers use quantifiers (that is, quantificational expressions, such as ‘exists’, and ‘there is’) with different meanings. Thus, when a universalist says ‘tables exist’, and a nihilist responds, ‘tables do not exist’, they are both in fact speaking the truth, but merely taking the term ‘exist’ to mean different things. Central to Hirsch’s view, moreover, is that there is no correct or privileged way to use quantificational language. There are many different ways in which one can describe reality (for example, ways in which mereological fusions of the Eiffel Tower and the Great Pyramid at Giza are said to exist, and ways in which they are not), but none of these ways is any more correct than any other. The upshot is that disputes like that between the nihilist and the universalist arise because the two parties are speaking different (albeit similar) languages. They are, at base, disputes about the meaning of words, not about the nature of reality.

There is strong opposition to Hirsch’s view, however, largely because it is often supposed to involve a radical anti-realism about the nature of reality. In short, if there is no correct way in which to describe reality, then it seems to follow that reality does not have an objective nature at all (for if it did, one could describe it rightly or wrongly). Many thinkers claim that reality does have an objective nature. What this means is that there is a correct way to describe the world, and that some descriptions are better (that is, more accurate) than others. Ted Sider, in his 2011 book, Writing the Book of the World, gives a comprehensive defence of this kind of view.

This debate over the legitimacy, or substantiality, of the SCQ is just a small part of the larger debate between ontological realists and ontological anti-realists. The former camp, including the likes of Sider, maintain that the ontological questions that metaphysicians often concern themselves with (concerning disputed entities such as composite objects, temporal parts, possible objects, abstract objects, universals, and so on) are important and substantive, and need to be answered satisfactorily. The latter camp, by contrast, including the likes of Hirsch, argue that these disputes are, for a variety of reasons, either defective or unimportant. This debate remains unresolved, and takes centre stage in the currently burgeoning field of metametaphysics (see Chalmers, Manley, and Wasserman (eds.) 2009).

7. References and Further Reading

  • Armstrong, D. M. (1997) A World of States of Affairs (Cambridge: CUP).
  • Baxter, D. (1988) ‘Identity in the Loose and Popular Sense’, Mind, 97, 388, pp.575–582.
  • Baxter, D. and Cotnoir, A. (eds.) (2014) Composition as Identity (Oxford: OUP).
  • Bunzl, M. (1979) ‘Causal Overdetermination’, The Journal of Philosophy, 76, 3, pp.134–150.
  • Cameron, R. (2010) ‘How to Have a Radically Minimal Ontology’, Philosophical Studies, 151, 2, pp.249–264.
  • Cameron, R. (2012) ‘CAI Doesn’t Settle the SCQ’, Philosophy and Phenomenological Research, 84, 3, pp.531–554.
  • Carmichael, C. (2011) ‘Vague Composition Without Vague Existence’, Nous, 45, 2, pp.315–327.
  • Carmichael, C. (2014) ‘Toward a Common Sense Answer to the Special Composition Question’, Australasian Journal of Philosophy, 93, 3, pp.475–490.
  • Caves, R. (2018) ‘Emergence for Nihilists’, Pacific Philosophical Quarterly, 99, pp.2–28.
  • Chalmers, D., Manley, D., and Wasserman, R. (eds.) (2009) Metametaphysics: New Essays on the Foundations of Ontology (Oxford: OUP).
  • Comesaña, J. (2008) ‘Could There Be Exactly Two Things?’, Synthese, 162, 1, pp.31–35.
  • Cornell, D. (2016) ‘Taking Monism Seriously’, Philosophical Studies, 173, 9, pp.2397–2415.
  • Cornell, D. (2017) ‘Mereological Nihilism and the Problem of Emergence’, American Philosophical Quarterly, 54, 1, pp.77–87.
  • Cowling, S. (2013) ‘Ideological Parsimony’, Synthese, 190, 17, pp.3889–3908.
  • Dorr, C. (2002) The Simplicity of Everything, PhD Thesis, University of Oxford, England.
  • Goldstein, L. (2012) ‘The Sorites Is a Nonsense Disguised by a Fallacy’, Analysis, 72, 1, pp.61–65.
  • Hawley, K. (2006) ‘Principles of Composition and Criteria of Identity’, Australasian Journal of Philosophy, 84, 4, pp.481–493.
  • Hestevold, S. (1981) ‘Conjoining’, Philosophy and Phenomenological Research, 41, 3, pp.371–385.
  • Hirsch, E. (2005) ‘Physical-Object Ontology, Verbal Disputes, and Common Sense’, Philosophy and Phenomenological Research, 70, 1, pp.67–97.
  • Horgan, T. and Potrč, M. (2008) Austere Realism (London: MIT Press).
  • Kim, J. (1993) Supervenience and Mind (Cambridge: CUP).
  • Korman, D. (2007) ‘Unrestricted Composition and Restricted Quantification’, Philosophical Studies, 140, 3, pp.319–344.
  • Korman, D. (2015) Objects: Nothing Out of the Ordinary (Oxford: OUP).
  • Leonard, H. S. and Goodman, N. (1940) ‘The Calculus of Individuals and Its Uses’, Journal of Symbolic Logic, 5, pp.45–55.
  • Lesniewski, S. (1916), ‘Foundations of the General Theory of Sets’ in Lesniewski, S. Collected Works, eds. S. J. Surma, J. Srzednicki, D. I. Barnett, and F. V. Rickey, trans. D. I. Barnett (Dordrecht, Kluwer, 1992) vol. 1, pp.129–173.
  • Lewis, D. (1986) On the Plurality of Worlds (Oxford: Basil Blackwell).
  • Lewis, D. (1991) Parts of Classes (Oxford: Basil Blackwell).
  • Loeb, L. E. (1974) ‘Causal Theories and Causal Overdetermination’, The Journal of Philosophy, 71, 15, pp.525–544.
  • Markosian, N. (1998), ‘Brutal Composition’, Philosophical Studies, 92, 3, pp.211–249.
  • Merricks, T. (2001) Objects and Persons (Oxford: OUP).
  • Merricks, T. (2005) ‘Composition and Vagueness’, Mind, 114, 455, pp.615–637.
  • Nolan, D. (1997) ‘Quantitative Parsimony’, The British Journal for the Philosophy of Science, 48, 3, pp.329–343.
  • Rea, M. (1998) ‘In Defence of Mereological Universalism’, Philosophy and Phenomenological Research, 58, 2, pp.347–360.
  • Rea, M. (ed.) (1997) Material Constitution: A Reader (Oxford: Rowman & Littlefield).
  • Schaffer, J. (2007a) ‘From Nihilism to Monism’, Australasian Journal of Philosophy, 85, 2, pp.175–191.
  • Schaffer, J. (2009) ‘On What Grounds What’ in Metametaphysics: New Essays on the Foundations of Ontology, edited by David J. Chalmers, David Manley and Ryan Wasserman (Oxford: OUP), chapter 12.
  • Sider, T. (2001) Four-Dimensionalism (Oxford: OUP).
  • Sider, T. (2011) Writing the Book of the World (Oxford: OUP).
  • Sider, T. (2013) ‘Against Parthood’, in Bennett, K. and Zimmerman, D. (eds.) Oxford Studies in Metaphysics, vol. 8. (Oxford: OUP), pp.237–293.
  • Silva, P. (2013) ‘Ordinary Objects and Series-Style Answers to the Special Composition Question’, Pacific Philosophical Quarterly, 94, 1, pp.69–88.
  • Simons, P. (1987) Parts: A Study in Ontology (Oxford: Clarendon).
  • Thomasson, A. (2006) ‘Metaphysical Arguments against Ordinary Objects’, The Philosophical Quarterly, 56, 224, pp.340–360.
  • Unger, P. (1979) ‘There Are No Ordinary Things’, Synthese, 41, pp.117–154.
  • Unger, P. (1980) ‘The Problem of the Many’, Midwest Studies in Philosophy, 5, 1, pp.411–468.
  • Van Cleve, J. (2008) ‘The Moon and Sixpence: A Defence of Mereological Universalism’, in Hawthorne, J., Sider, T., and Zimmerman, D. (eds.) Contemporary Debates in Metaphysics (Oxford: Blackwell), pp.321–340.
  • Van Inwagen, P. (1987) ‘When Are Objects Parts?’ Philosophical Perspectives, 1, pp.21–47.
  • Van Inwagen, P. (1990), Material Beings (Ithaca: Cornell UP).
  • Varzi, A. (2000) ‘Mereological Commitments’, Dialectica, 54, pp.283–305.
  • Wallace, M. (2011) ‘Composition as Identity’, parts I and II, Philosophy Compass, 6, 11, pp.804–827.
  • Wiggins, D. (1968) ‘On Being in the Same Place at the Same Time’, Philosophical Review, 77, 1, pp.90–95.
  • Williams, J.R.G. (2006) ‘Illusions of Gunk’, Philosophical Perspectives, 20, 1, pp.493–513.

 

Author Information

David Cornell
Email: DMCornell@uclan.ac.uk
University of Central Lancashire
United Kingdom

Chinese Philosophy: Overview of History

There was no effort to write a comprehensive history of Chinese philosophy until the modern period of Western influence on Chinese culture. This is not to say that Chinese thinkers did not engage selectively with philosophers of earlier or contemporary eras.

What has come down to us as the final chapter of the Zhuangzi (ch. 33, Tian Xia “Under Heaven”) offers a sort of history of the development of Chinese philosophy. Of the writers of texts that survive to this day, it was Sima Tan (165?-110 B.C.E.) who made the first real attempt to classify Chinese thinkers into six major schools: Yin-Yang, Confucianism (Rujia), Mohism (Mojia), the School of Names (Mingjia), Legalism (Fajia), and Daoism (Daojia). As the history of Chinese philosophy evolved, more categories were added to these six, as well as various permutations and blends of them (for example, Profound Learning/Xuanxue and Neo-Confucianism/Lijia).

Hu Shi’s An Outline of the History of Chinese Philosophy (1919) is the first work by a Chinese scholar to undertake the project of writing a comprehensive history of the transformations of Chinese philosophical thought, although it is presented by the author as only an outline. Feng Youlan (Fung Yu-lan, 1895-1990) wrote the most widely known and used work on the history of Chinese philosophy in the 20th century. His two-volume History of Chinese Philosophy (volume 1, 1931 and volume 2, 1934) is a landmark work having a range and depth far exceeding that of Hu Shi’s Outline. Lao Siguang’s History of Chinese Philosophy (1982) makes it quite clear that his intention was to write a work that made use of Western critical standards in all respects. One of the most thorough and well-informed studies of the history of Chinese philosophy in a single volume is The History of Chinese Philosophy, edited by Bo Mou.

It is a common characterization of the history of Chinese philosophy to say that its overall trajectory may be captured in the concept of “the three teachings” (sanjiao): Confucianism, Daoism, and Buddhism. If we acknowledge the numerous permutations, revisions, re-conceptualizations, and syntheses of them, and if we speak of the three teachings as analogous to streams of influence flowing together into the broad river of Chinese philosophy, then this is still a fruitful way of conceiving of the major historical forces at work in the tradition, from at least the 3rd century C.E. down at least to the modern period. Beginning in the late 18th century, Western philosophical influences began to flow into the stream of Chinese philosophy, as well.

Table of Contents

  1. Classical Chinese Philosophy in the Pre-Qin Period (before 221 B.C.E.)
    1. The “Great Commentary (Da Zhuan)” to the Classic of Changes (Yijing)
    2. Confucius (551-479 B.C.E.) of the Analects
    3. Mozi (c. 470-391 B.C.E.) and Mohism
    4. The School of Names, Mingjia 名家 (Disputers, Dialecticians, Bianshi)
    5. The Daodejing
    6. The Zhuangzi
    7. Mencius (c. 372-289 B.C.E.)
    8. Xun Kuang or Xunzi (c. 325?-235? B.C.E.)
  2. Philosophy from the Qin (221 B.C.E.) to the Tang (618 C.E.)
    1. Syncretic Philosophies in the Qin and Han Periods
      1. Master Han Fei (c. 280-233 B.C.E.) and Legalist Philosophy
      2. The Masters of Huainan (Huainanzi)
      3. The Luxuriant Dew of the Spring and Autumn Annals of Dong Zhongshu
    2. The Rise of Critical Philosophy in China: Wang Chong (25-100 C.E.)
    3. Profound Learning (Xuanxue)
  3. Early Buddhism in China
    1. The Dhammapada (Chinese translation, c. 224 C.E.)
    2. Tiantai Buddhism
    3. Consciousness-only Buddhism
    4. Chan Buddhism
  4. The Song Period (960-1279 C.E.) and Neo-Confucianism
    1. Morality Books of the Three Teachings (Sanjiao) Tradition
    2. Neo-Confucianism: The Original Way of Confucius for a New Era
      1. Zhou Dunyi (1017-1073)
      2. Cheng Hao (1032-1085 C.E.) and Cheng Yi (1033-1107 C.E.)
      3. Zhu Xi (1130-1200 C.E.) and the Neo-Confucian Synthesis
      4. Wang Yangming (1472-1529 C.E.)
  5. The Chinese and Western Encounter in Philosophy
    1. Dai Zhen (1724-1777 C.E.)
    2. Kang Youwei (1858-1927 C.E.)
    3. Zhang Dongsun (1886-1973 C.E.)
    4. Hu Shi (1891-1962 C.E.)
    5. Mao Zedong (1893-1976 C.E.)
  6. Whither China? Philosophical Views
    1. Kang Xiaoguang (b. 1963 C.E.)
    2. Tu Wei-ming (1940-) and New Confucianism
  7. References and Further Reading

1. Classical Chinese Philosophy in the Pre-Qin Period (before 221 B.C.E.)

a. The “Great Commentary (Da Zhuan)” to the Classic of Changes (Yijing)

As a repository of philosophical reflection on the nature of reality and the human place in it, the story of Chinese philosophy may be said to begin with the Classic of Changes (Yijing). This work is composed of two parts: 1) a quite ancient manual of divination known simply as the Changes (Yi), or, more correctly, as the Zhouyi because it is a handbook whose practices and procedures are traceable to the period of the Western Zhou dynasty (c. 1046-771 B.C.E.) and 2) a set of seven commentaries (zhuan) attached to the Yi and traditionally ascribed to Confucius, although there is no firm evidence that he wrote them, or even that he used them. Three of the commentaries are composed of two sections each, so taken as a whole, the commentary set is known as “The Ten Wings (Shiyi).” One of the commentaries to the Yi is known by the various titles of “The Great Commentary (Da Zhuan)” or “Appended Statements (Xici).” For a study of philosophy, “The Great Commentary” is arguably the most important single commentary, offering the earliest written account of Chinese ontology currently available to us.

Edward Shaughnessy (1997) has done a recent translation of The Classic of Changes based on the Mawangdui archaeological finds. He offers reasons for thinking the work was edited most likely during the long period from 320-168 B.C.E. While it is true that some material in the “Great Commentary” may have its origin as late as the Han dynasty, there is clear evidence in concepts and reasoning of a much earlier period in the text, as well. The “Great Commentary” provides a clear exposition of the early Chinese worldview that all things are in a constant process of change. Readers will notice that “Yi” is sometimes used for the Zhouyi/Yijing as a divination guide and sometimes simply for “the process of reality” itself.

b. Confucius (551-479 B.C.E.) of the Analects

The earliest association of Chinese philosophy with a specific figure whose work is not only still extant, but widely used, is that of Confucius (personal name Kong Qiu, also known as “Master Kong” or Kongzi, 551-479 B.C.E.). Confucius was born, lived, and taught during the classical period of China. His philosophical teachings were gathered and transmitted largely, but not exclusively, in a work known as the Analects (also known as Lunyu, meaning “Selected Sayings”). This book is composed of short texts and brief conversations in which Confucius is often, but not always, the main teacher. The received version of the Analects is divided into 20 books that are further categorized with the convention of listing the book first, then the analect (that is, 3.1 is Book Three, analect one). Recent textual critical studies of the received text of the Analects have identified various strata in the collection, according to which some analects are more likely to be traceable to the historical Confucius, others to his disciples, others to master teachers associated with him or a generation removed, and still others that may be several generations removed from Confucius himself.

Ronnie Littlejohn’s understanding of this structure divides the text into the following categories: basic teachings on philosophical concepts probably traceable to Confucius (for example, Book 4); comments on disciples and personages by Confucius (for example, Book 5); collection of teachings to specific students by topic (for example, Book 12); and later materials codified for transmission by students and later masters (for example, Books 19 and 20) (Littlejohn 2011). There are many fine complete English translations of the Analects, some of which are available online.

Confucius stood within the tradition of scholars called Ru (儒). Writing in the Han dynasty (206 B.C.E.-220 C.E.), centuries after Confucius’s lifetime, Liu Xin (46? B.C.E.-23 C.E.) says the Ru first appeared as an identifiable professional group in the early Zhou dynasty (c. 1046-256 B.C.E.). They were noted for their allegiance to the sage kings of ancient China who followed what they called “the Way (Dao) of Heaven” and its social, religious, and moral proprieties (li 禮). Liu Xin tells us the Ru were devoted to the “Six Classics (Liujing)” and took Confucius as their master teacher.

The “Confucianism” of teachers and literati who studied, modified, and applied Confucius’s ideas began robustly during the Han dynasty and continues in some forms down to the 21st century. Confucianism struggled throughout Chinese history with other intellectual streams, including Daoism and Buddhism. In the 12th century, Zhu Xi (1130-1200 C.E.) assembled a set of works known simply as the “Four Books,” which he took to represent the core of Confucian teachings: the “Great Learning (Daxue),” the “Way of Balance (Zhongyong),” the Analects, and the Mencius. This collection became the curriculum for China’s civil service examination system down to the year 1911, as well as for similar national exams in Japan, Korea, and Vietnam. During the period 1966-1976, Confucius and Confucianism were attacked as feudalistic and oppressive. Not until the mid-1980s did a recovery of Confucian philosophy begin with the so-called New Confucians.

Unlike the “Great Commentary,” Confucius’s teachings in the Analects are not concerned with ontology or cosmology but rather with human self-development and social ethics. Because of the prominence of Confucian thought both in China and in the early encounter of Western philosophy with Chinese philosophy, it was often said that Chinese thought is only socio-political or ethico-moral in its interests. This is, of course, not true at all. However, it is not an inaccurate representation of Confucius’s thought.

Based on the development of Confucianism in history, the following concepts from the Analects can be identified with confidence as central to Confucius’s thought and contribution to Chinese philosophy: ren 仁 (humaneness, benevolence); junzi 君子 (exemplary person, gentleman); yi 義 (righteousness, appropriate behavior for the situation); xiao 孝 (filiality); and li 禮 (ritual propriety).

c. Mozi (c. 470-391 B.C.E.) and Mohism

Although it is little known and not influential today, the “Mohist School” was one of the most influential movements in pre-Qin China. The thinkers in this tradition were students and later followers of Mo Di (also known as “Master Mo” or Mozi, c. 470–391 B.C.E.). According to the Records of the Historian (Shiji) by Sima Qian, Mozi was an official of the state of Song, and he lived just after Confucius. Our primary source for his thought is a collection of materials edited into a large anthology simply called the Mozi, although this text also contains materials much later in origin than the historical Mo Di, thus representing more fully Mohism as a school or movement. The Mozi contains essays, short dialogues, anecdotes, and compact philosophical discussions. One part of the text sets out the “Ten Core Doctrines” of Mohism in triads of essays, with the essays in each triad exploring the same principal ideas and often containing repeated language and examples. The essay layers are designated as shang 上, zhong 中, and xia 下 (that is, upper, middle, and lower). Just why there are three versions of each core doctrine is not certain, but the prevailing theory is that the triads are probably versions of oral and/or written traditions representing three different lineages of Mohism coming down from the historical Mozi. It is clear, though, that within the triads there are some divergences in philosophical position between the versions. Some of these are of little consequence, but others are more significant. There is no attempt to harmonize all of these into a monolithic version.

An English translation of the entire Mozi is Ian Johnston’s The Mozi: A Complete Translation (2010). The Mozi contains philosophical reflections on a wide range of questions and problems. Mozi is revealed as a master of argument, making him the foremost representative of the “debaters (bianshi)” of the classical period. He sets forward the earliest form of consequentialism in political and moral thought, opposes military aggression, advocates state welfare for the people, holds to an absolute merit-based principle for most levels of political leadership, and consistently advocates for the folk religious beliefs of his day. The Mozi takes sophisticated positions on logic, epistemology, causality, and language. Arguably, the central moral idea of Mozi is jian ai 兼爱, which may be rendered as “universal love” or “impartial concern.”

d. The School of Names, Mingjia 名家 (Disputers, Dialecticians, Bianshi)

Among the dialecticians (debaters, bianshi) associated with what we may call the School of Names, Hui Shi (350-260 B.C.E.) and Gongsun Long (320-250 B.C.E.) are easily the most prominent. Other thinkers often mentioned are Deng Xi and Yin Wen. Unfortunately, with the exception of the partial anthology Gongsunlong Zi, the works of these thinkers have all been lost.

In both the Zhuangzi’s chapter 33 and in Sima Tan’s remarks on “the six schools,” the School of Names and its dialecticians are somewhat ridiculed for making minute examinations of trifling points or intricate distinctions in the use of terms. For example, they are associated with making philosophical points about the distinctions between shapes and colors, unity and plurality, similarity and difference.

However, the philosophers practicing this methodology were interested in demonstrating that our understanding of the world is radically perspectival or relative. They sought to move persons out of dogmatic positions, which tended to elevate particular points of view to absolute truths. In Zhuangzi 33, twenty-one theses associated with these thinkers are included, and they are most often taken by interpreters as examples full of counterintuitive and even absurd implications, in ways familiar from characterizations of Zeno in Western thought. Ten theses of Hui Shi, also reported in Zhuangzi 33, are examples of these methods.

e. The Daodejing

The long-standing tradition in China is that an individual philosophical master named Laozi was the author of a philosophical work known as the Daodejing, which means the “Classic of Dao (the Way) and its De (virtuous power).” This understanding of authorship is almost universally rejected by scholars now in favor of the view that the text is a collection of materials from “ancient masters” collected in different versions beginning in about 300 B.C.E. and continuing until the standard edition made by Wang Bi sometime between 226 and 249 C.E. Nonetheless, the impact of the Daodejing has been monumental as the classical representative of the tradition of Daoism, often characterized as the yin (passive) spirit in Chinese philosophy, where Confucianism is regarded as the yang (active). The remarks in the Daodejing counsel naturalness, simplicity, and spontaneous action without effort according to the movement of Dao, while Confucianism advocates active cultivation of one’s nature by learning, vigorous effort and political involvement, and conformity with established proprieties reconceived in their application by each new generation.

The Daodejing is one of the most widely translated Chinese texts and, as with the Analects, there are many English versions of the complete text. Among these, some that stand out include P. J. Ivanhoe, The Daodejing of Laozi, D. C. Lau, Tao Te Ching, and Michael LaFargue, The Tao of the Tao-te-ching. Textual, literary, and redaction-critical approaches have shown that the received text is a collection of teachings used across lineages of Daoist teachers and not the work of a single author, although in its received form a final editor did collate its current arrangement. The component aphorisms and remarks of the Daodejing are strung together somewhat like beads on a string by an editor or editors.

f. The Zhuangzi

The Zhuangzi is one of the formative texts of classical Chinese Daoism traditionally ascribed to the philosopher Zhuang Zhou (c. 365?-290? B.C.E.). The received text was edited by a scholar official named Guo Xiang (d. 312 C.E.) and contains 33 chapters. Most of these, like those in the Daodejing, contain many component logia. However, unlike the case of the Daodejing, we know that there was a much larger and older Zhuangzi. This “lost Zhuangzi” consisted of 52 chapters, and it is mentioned on a list in Imperial bibliographies dating from about 110 C.E. Within the Zhuangzi are cycles of materials related to Laozi and other characters, real and imaginary. Contemporary scholars, such as Liu Xiaogan (1994), Harold Roth (1991), and Ronnie Littlejohn (2010), have all suggested models for understanding the structure of the text of the Zhuangzi. The following represents the textual division by Littlejohn:

Inner Chapters (chs. 1-7) contain a number of logia that may be attributed to Zhuang Zhou and very likely represent the oldest material in the book.

Daode Chapters (chs. 8-10e) represent a clear break in the text and form a coherent essay, often using the first person and employing illustrations of its points internal to the essay. The essay is not interrupted by any disconnected logia. As such, it is likely that the essay was written by a single individual who made use of texts and themes, some of which are also found in the Daodejing.

Yellow Emperor-Laozi Chapters (also known as Huang-Lao Daoism) (largely chs. 11-16, 18, 19, and 22) are traceable to a lineage of Daoist teachers that developed during and after the heyday of the Jixia Academy (318?-284? B.C.E.) and had distinctively different emphases than those found in the other layers of the Zhuangzi and in the Daodejing. The earliest look that we get at the characteristics of this important tradition in Daoist history is in the Zhuangzi itself. The Masters of Huainan (Huainanzi, 139 B.C.E.) represents a continuation and maturing of these ideas.

Zhuangzi Disciples Chapters (largely Chapters 17-28) contain logia associated with the earliest disciples and second-generation transmitters of Zhuang Zhou’s teachings and that have close connections with ideas in the Inner Chapters (Littlejohn 2010).

g. Mencius (c. 372-289 B.C.E.)

If our ancient sources are correct in their chronologies, Mencius (that is, Meng Ke, Mengzi, or “Master Meng,” c. 372?-289? B.C.E.) was a contemporary of Zhuang Zhou, the Daoist master. The text coming down to us as the Mengzi contains virtually all of his significant teachings. Within the Confucian stream of Chinese philosophy, Mencius’s influence was so significant that he became recognized as the most authoritative interpreter of Confucius’s teachings and was known as “Mengzi the Second Sage.” He was a defender of Confucianism during the period of the Hundred Schools of Thought, which spanned the so-called Spring and Autumn (771-476 B.C.E.) and Warring States (475-221 B.C.E.) periods of Chinese history. Mencius was likely one of the major teachers at what has been called the Jixia Academy (318?-284? B.C.E.). The Mengzi that contains his philosophical remarks later became one of the “Four Books (Sishu)” that formed the core of the Confucian examination and education system for centuries.

The Mengzi appears to have been collected by Mencius’s disciples, some of whom are referred to in the text as “masters” themselves, indicating a later period of composition for those passages. The received text was edited by Zhao Qi (d. 201 C.E.) into seven books, each in two parts, and each part with a number of passages. When scholars cite the Mengzi, the form is always book, section, passage (for example, 3B9). This citation form enables the reader to locate the passage in any of the complete translations of the text. Among the best full-text translations of the Mengzi are D. C. Lau, Mencius; Bryan Van Norden, Mengzi with Selections from Traditional Commentaries; and Irene Bloom, Mencius (completed and edited by Philip J. Ivanhoe).
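
For readers who keep reading notes or an index of passages electronically, the citation form just described can be split into its three components mechanically. The following is only a minimal sketch in Python: the names MengziCitation and parse_citation are hypothetical, the letters A and B are assumed to mark the two parts of each book (as in the example 3B9), and no check is made that a given passage number actually exists in the received text.

```python
import re
from typing import NamedTuple


class MengziCitation(NamedTuple):
    """One citation in book-part-passage form (hypothetical helper type)."""
    book: int     # 1 through 7, following Zhao Qi's seven books
    part: str     # "A" or "B" (assumed labels for the two parts of a book)
    passage: int  # passage number within that part; not checked against the text


# One digit 1-7, one part letter, then a positive passage number.
_CITATION = re.compile(r"([1-7])([AB])([1-9]\d*)")


def parse_citation(text: str) -> MengziCitation:
    """Split a citation such as '3B9' into book, part, and passage."""
    match = _CITATION.fullmatch(text.strip().upper())
    if match is None:
        raise ValueError(f"not a book-part-passage citation: {text!r}")
    book, part, passage = match.groups()
    return MengziCitation(int(book), part, int(passage))
```

Under these assumptions, parse_citation("3B9") yields book 3, part B, passage 9, while a string such as "8A1" is rejected because the received text has only seven books.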

h. Xun Kuang or Xunzi (c. 325?-235? B.C.E.)

What little is known of the life of Xun Kuang (also known as Master Xun or Xunzi, c. 325?-235? B.C.E.) is culled from evidence in his own writings and from the brief biography written by the historian Sima Qian some hundred years or so after Xunzi’s death. If we are right about Xunzi’s year of birth, he would have been around 20 years old when Mencius died. Sima Qian reports that Xunzi studied at the Jixia Academy, and it is quite possible that he was well acquainted with Mencius’s ideas directly or through first-generation disciples. He and his disciples seem to have been highly regarded by the rising Qin rulers. In fact, two of his students, Han Fei and Li Si, were instrumental in developing the theory of law and justice used during the Qin dynasty (221-206 B.C.E.) and known simply as Legalism. The primary source for Xun Kuang’s thought is known simply as the Xunzi. This book consists of 32 chapters that are essentially well-crafted, self-contained essays. Interestingly, though, the Xunzi was not a part of any of the later lists of Confucian classics in the canon, very much unlike the Mengzi, which became part of the Four Books and occupied a central place in Confucian learning.

For years, the standard English translation of Xunzi was that by John Knoblock (1988, 1990, 1994), but in 2014 a new complete version appeared by Eric Hutton.

2. Philosophy from the Qin (221 B.C.E.) to the Tang (618 C.E.)

a. Syncretic Philosophies in the Qin and Han Periods

During the Qin and Han periods, it was not uncommon to gather communities of scholars together and also collect numerous texts, all from different philosophical traditions. A result of this process was the creation of works that attempted to unify and synthesize previous learning, representing an effort to create a harmonized body of truth. Two of these syncretic works are the Hanfeizi and the Masters of Huainan (Huainanzi).

i. Master Han Fei (c. 280-233 B.C.E.) and Legalist Philosophy

Master Han Fei (Hanfeizi, c. 280-233 B.C.E.) was a student of Xunzi, probably at the Jixia Academy. His essays, gathered into the work Hanfeizi, were most likely written for the kings of the Han state, King Huan Hui (r. 272-239 B.C.E.) and King An (r. 238-230 B.C.E.). Han Fei is regarded as a principal representative of the “Legalist School (fa jia).” The “Legalist School” refers loosely to Chinese philosophers of the classical period whose common conviction was that law rather than morality was the most reliable ordering mechanism for society. A number of philosophers associated with this school were active in government and as imperial consultants. Han Fei himself was an advisor in the Han state just prior to its annexation by the Qin during the consolidation of China’s first empire in 221 B.C.E. Wenkui Liao’s translation, The Complete Works of Han Fei Tzu with Collected Commentaries, is available electronically at the Institute for Advanced Technology in the Humanities, University of Virginia (ed. Anne Kinney).

ii. The Masters of Huainan (Huainanzi)

According to his biography in the Book of the Early Han, Liu An (179-122 B.C.E.), the king of Huainan (in modern Anhui province) and uncle of Han Emperor Wu, gathered a large number of philosophers, scholars, and practitioners of esoteric techniques to Huainan roughly in the period 160-140 B.C.E. to debate and synthesize all learning. The collected volume now known as Masters of Huainan (Huainanzi) was a product of this interchange of ideas. It was presented to Emperor Wu in 139 B.C.E. as a suggested program of rulership, although it was rejected, leading perhaps to the death of Liu An himself.

The Masters of Huainan is a synthetic document meant to harmonize the thought of the so-called “Hundred Schools (zhuzi baijia)” as a sort of universal encyclopedia of knowledge, although most scholars hold that its primary influence is associated with what is known as Yellow-Emperor Daoism (Huang-Lao Daoism). In its received form, it is a work of 21 essays ranging in subject matter from cosmology and astronomy to inner qi (vital energy) cultivation, bio-spiritual transformation, and political rulership.

The first complete English translation of the text is The Huainanzi: A Guide to the Theory and Practice of Government in Early Han China, by John Major, Sarah Queen, Andrew Set Meyer, and Harold Roth (2010).

iii. The Luxuriant Dew of the Spring and Autumn Annals of Dong Zhongshu

Dong Zhongshu (c. 198-104 B.C.E.) was more successful than Liu An in crafting a philosophical vision attractive to the Han rulers. Dong was one of the central figures involved in the resurgence of Confucianism and the Confucian classics in the Han Dynasty. His version of Confucianism drew within it the cosmologies of the five phases (wuxing) and the yin-yang school prominent during the Han period. The Luxuriant Dew of the Spring and Autumn Annals (Chun Qiu Fan Lu) is a work in 17 parts, containing 123 chapter titles, of which 79 chapters survive. Although traditionally ascribed solely to Dong Zhongshu, it shows the signs of multiple editorial hands and cannot be attributed in its entirety to him.

Selections from Dong have been translated by Mark Csikszentmihalyi and are included in Readings in Later Chinese Philosophy: Han to the Twentieth Century.

b. The Rise of Critical Philosophy in China: Wang Chong (25-100 C.E.)

Wang Chong (25-100 C.E.) studied in the imperial school in Luoyang, Henan province. After his training, he returned to his home near modern Shangyu, Zhejiang province, in the position of Officer of Merit. His writings on subjects ranging from morality to government to science and technology were compiled into the work Critical Essays (Lunheng). Each of the essays is meant to stand alone as a separate philosophical analysis, and there is no attempt to harmonize the seeming contradictions or inconsistencies between the essays that come into view when the collection is read as a whole. Wang is generally acknowledged as a philosopher who is critical of many traditional beliefs of his day. He does not believe Heaven interferes with natural happenings, nor does it reward and punish persons for their actions, as Mozi thought it did. Destiny, chance, and luck are more important operators for describing what happens to us in his philosophy. Wang thinks human activity is actually of little consequence in the grand sweep of reality, and he largely disconnects happiness and unhappiness from the notions of legal reward and punishment, or even from any direct connection to our moral actions. He especially rejects reports of what we would call supernatural occurrences and interventions in human life and nature.

Alfred Forke’s translation of Wang’s Critical Essays is available online at http://www.humanistictexts.org/wangchung.htm.

c. Profound Learning (Xuanxue)

The movement known as Profound (or Mysterious) Learning (Xuanxue) has been labeled “Neo-Daoism.” This generalized term once was used to refer to the period of development of Chinese philosophy from the decades immediately preceding the fall of the Han dynasty to approximately the early 300s C.E. However, the term is misleading and no longer in favor due to the fact that the movement it seeks to describe claimed no particular Daoist sectarian identity but instead encapsulates a complex set of fresh insights and intense debates about new directions in Chinese thought.

Major figures generally associated with Profound Learning include He Yan (c. 207-249 C.E.), Wang Bi (226-249 C.E.), and Guo Xiang (d. 312 C.E.). Generally speaking, all three of these philosophers were working with the syncretic philosophies of the late Han as their background, but they were seeking to make new interpretations of original classical sources such as the Yijing, the Daodejing, and the Zhuangzi, or what were known as “the Three Profound Treatises (sanxuan).” Wang Bi and Guo Xiang edited and commented on what may now be called the standard texts of the Daodejing and the Zhuangzi. He Yan commented on the Analects. All of the philosophers in this tradition were seeking to demonstrate a unity of the Chinese classical texts into one tradition. However, this is not to say that the three thinkers mentioned here shared the same interpretations of Chinese concepts, nor that they even ranked the classical texts and thinkers in the same priority, although all valorized Confucius.

3. Early Buddhism in China

Buddhism first reached China from India roughly 2,000 years ago during the Han dynasty. It is generally agreed now that Buddhism entered along several different trade routes in the 1st century C.E., both in northern and southern regions of China, but the northern route known simply as the Silk Road is still regarded as the most prominent line of entry for Buddhist monks, believers, and traders. The Buddhists entering along this route established famous monastic and study sites at places such as Dunhuang, Chang’an (Xi’an), and Luoyang, leaving behind marvels of art and architecture, as well as fascinating texts. As early as the 2nd century C.E., a few Buddhist monks, such as Lokaksema (Zhi Loujiachen, 147-? C.E.), a monk from Gandhara, began translating Buddhist sutras and commentaries from Sanskrit into Chinese. The most famous of such monks was Xuanzang (602-664 C.E.), whose travels to India to acquire texts and whose creation of a translation school at Chang’an are celebrated both in historical records and in the classic Chinese novel Journey to the West (Xiyou ji).

a. The Dhammapada (Chinese translation, c. 224 C.E.)

The Dhammapada (Fa Jujing) was translated into Chinese about 224 C.E., and the tradition is that it represents a 423-verse sermon attributed to the historical Buddha, that is, Siddhartha. This work is often neglected in a study of Buddhism’s early impact on Chinese philosophy. While it is arguably the most popular work in the Pali Canon, how it came to China and just how widely it was used are still matters of debate. Nevertheless, it represents well the earliest texts introducing the new way of thinking known as Buddhism into the Chinese philosophical tradition. The selection that follows is taken from John Richards’s 1993 translation available electronically at http://www.geocities.ws/sharibushariputra/SharibuShariputra/BuddhaDharma-Dhammapada_1.htm. The verse number is provided at the end of each teaching.

b. Tiantai Buddhism

The Tiantai School of Buddhism (Tiantai zong) was entirely of Chinese origin. Tiantai grew and flourished as a Buddhist school under its fourth patriarch, Zhiyi (538-597 C.E.), who asserted that the Lotus Sutra (that is, The Sutra of the Lotus Blossom of the Subtle Dharma, Miaofa Lianhua Jing) contained the supreme teaching of Buddhism. The school derives its name from the Tiantai Mountain that served as its most important monastic community and the one at which Zhiyi studied.

Two of Zhiyi’s most important philosophical teachings are “the Ten Ways of Existing in Reality” and “The Threefold Truth.”

The most distinctive ontological claim of Tiantai is that there is only one reality, which is both the phenomenal existence of our everyday experience and nirvana itself. There is no transcendent dimension or place that exists apart from the reality we are experiencing here and now. In fact, Tiantai writings describe 10 ways one may exist in reality:

  1. Hell Beings
  2. Hungry Ghosts
  3. Beasts (that is, beings of animal nature)
  4. Asuras (demons)
  5. Human Beings
  6. Gods or celestial creatures
  7. Voice-hearers (Sravakas)
  8. Self-enlightened Ones (Pratyekabuddhas)
  9. Bodhisattvas
  10. Living Buddhas

In Tiantai ontology, the reality that the Hell Beings inhabit is the same reality in which the Buddhas live. There is no supernatural boundary between these ways of existing or transcendent place to which some go (for example, Heaven), while others dwell elsewhere (Hell). Living and working next to us may be one who is a Hell Being or a Bodhisattva or even a Buddha. Indeed, we ourselves may be demons or Bodhisattvas, depending on whether we follow the Buddhist way.

Zhiyi’s Teaching of the Threefold Truth (san di) may be summarized in the following way. 1) We can make true statements about the world of existing things. These truths are about things that exist and their interactions in a network of interdependent causes. These are the truths of history, science, and so forth. The truth of a statement here is verified by testing it against the world of our experience. 2) It is also true to say that all things are empty (kong di) and have no permanence. There is no permanent essence to anything in our world of experience, including ourselves. 3) The third character of truth is that the mundane or phenomenal world is real and at the same time it is impermanent and ultimately empty.

The Great Calming and Contemplation (Mohe zhiguan), a massive treatise of edited lectures by Zhiyi on meditation, offers the teaching that we may dwell in one or more of the Buddhist 10 realms at any given time. The more one moves in calm and contemplation toward Buddha consciousness, however, the more the other realms of consciousness recede and eventually dissipate. The contents of the work are organized into 10 chapters, which systematically trace the perfect path of calming and contemplation to the final actualization of Buddhahood itself. The translation by Daniel Stevenson (1996) is available electronically at http://chancenter.org/cmc/1996/08/26/selections-from-chi-is-great-calming-and-contemplation/.

c. Consciousness-only Buddhism

Xuanzang (602-664 C.E.), born Chen Hui, was a Chinese Buddhist monk, scholar, traveler, and translator in the early Tang dynasty. He was born in 602 into a well-educated family in Chenhe village, near present-day Luoyang in what is now Henan province. Although he received an orthodox Confucian education, he lived for five years at Jingtu monastery (Jingtu si) in Luoyang. He spent more than 10 years traveling and studying in India. When he returned, he brought back 657 Buddhist texts and devoted the remainder of his life to a translation school he established in Chang’an (Xi’an). His travels in India are recorded in detail in the classic Chinese text Great Tang Records on the Western Regions (Da Tang Xiyuji), which in turn provided the inspiration for the fictitious religious novel Journey to the West (Xiyou ji), written by Wu Cheng’en during the Ming dynasty, around nine centuries after Xuanzang’s death. Xuanzang’s creation in China of the “Consciousness-only” School of Buddhism (Weishi zong) was greatly influenced by the writings of the Indian Yogacara master Vasubandhu (Chinese name, Shi Qin). Xuanzang wrote an extensive commentary in 10 volumes on Vasubandhu’s text Thirty Stanzas of Consciousness-Only, entitled A Treatise on the Establishment of Consciousness-Only (Cheng Wei-shi Lun), and used it to set out his own views of this tradition of Buddhist teaching. The only complete English translation of Xuanzang’s Treatise is by Tat Wei (1973), but Chan’s Sourcebook (1973: ch. 23) contains excerpts from it.

The central ontological tenet of Consciousness-only Buddhism is that nothing exists but consciousness. Of course, this is in direct conflict with early Chinese ontology, since qi is an energy that may produce consciousness but is not itself a form of consciousness. According to Consciousness-only philosophy, we have a flow of experienced ideas that we also call perceptions. However, these ideas or perceptions are not caused by concrete or material things that are external to us and continue to exist whether we are conscious of them or not. In philosophical language, the ontology of Consciousness-only is simply called Idealism.

d. Chan Buddhism

Chan Buddhism developed in China between the 6th and 8th centuries C.E. It is regarded as a uniquely Chinese form of Buddhism that later was transplanted into Japan, where it became prominent as Zen. The Chinese word chan is used to translate the Sanskrit dhyana, which means “meditation.” Although Chan is regarded as Chinese in origin and tenor, its founding legend is that the Buddha transmitted a private esoteric teaching, never written in any sutra, but passed only from one teacher to another. The twenty-eighth patriarch in this lineage of transmission is known as Bodhidharma (470-543 C.E.), and he is said to have brought the teaching to China.

In the history of Chan, there is a Northern and Southern School. The Northern followed Shenxiu (c. 605-706 C.E.) as its patriarch, and the Southern followed Dajian Huineng (638-713 C.E.). According to the Platform Sutra of the Sixth Patriarch, regarded as the canonical expression of Chan philosophy, the split between these schools arose over who should succeed Hongren (601-674 C.E.), who was the fifth patriarch of Chan. The sutra tells the story of Huineng’s ascendancy to that role.

Chan’s theory of knowledge is concerned with one’s own mind, elevating in importance that which is known by the mind through immediate, direct acquaintance. When Chan philosophers claim that we know the world through our own minds, they do not identify mind with the thoughts presently in front of us. They mean one’s original mind, before the mind was clouded over by experiences or human distinctions made in language. It is in our original minds that we have awareness of absolutely certain truth. D. T. Suzuki says this knowledge “is not derivative but primitive; not inferential, not rationalistic, not mediational, but direct, immediate; not analytical but synthetic; not cognitive, but symbolical; not intending but merely expressive; not abstract, but concrete; not processional, not purposive, but ultimate, final and irreducible; not eternally receding, but infinitely inclusive; etc.” (1956: 34).

The Platform Sutra of the Sixth Patriarch (Liuzu Tanjing) presents itself as a written transcription of the lectures of Huineng. Philip J. Ivanhoe’s translation is based on the Dunhuang version.

4. The Song Period (960-1279 C.E.) and Neo-Confucianism

a. Morality Books of the Three Teachings (Sanjiao) Tradition

The village lecture system of Song dynasty China made use of morality books (shanshu) to create, shape, and transmit a unified moral culture throughout the empire. Arguably, the most important of these was Tract of the Most Exalted on Action and Response (Taishang ganying pian). The Tract likely reached its final form between the 10th and 12th centuries C.E., and it is still widely available today. Although primarily a work of Daoist spiritual piety and relatively brief in length, having only 1,277 characters, it shows numerous Buddhist and Confucian influences and moral injunctions as well, thereby representing in itself the “Three Teachings (sanjiao)” of China. The work attributes its own authorship to Taishang, by which is meant Laozi. The term ganying employed in the title of the work is a way of speaking about sowing and reaping or receiving the results of one’s action, a concept used often in the Masters of Huainan and frequently associated with the Buddhist notion of karma when it entered China. The work builds on earlier Chinese moral tracts of similar scope and teaching such as The Code of Nuqing for Controlling Ghosts (probably composed sometime between 143 and 224 C.E.) and Ge Hong’s (283-343 C.E.) merit system in The Master Who Embraces Simplicity (c. 316 C.E.).

Like these earlier works, the Tract represents an extremely quantitative view of morality, relying on the counting of good and evil deeds as a way of predicting coming blessings or punishments, including the shortening or lengthening of life. In the introductory remarks of the Tract, Taishang says that moral transgressions reduce a person’s lifespan and poverty comes upon the immoral person. The immoral person meets with calamity and misery, and all men hate him. In this work, Taishang reports a bureaucracy of numinal beings who are record keepers in charge of recording the good and evil deeds of every individual. According to the text, those who wish to attain to a celestial spiritual life should perform a net result of 1,300 good deeds, and those who wish to attain an indefinite earthly life should perform 300. One’s moral deeds are kept, as it were, on a ledger and counted by the celestial powers. Evil deeds do not disappear from one’s ledger, indicating their lasting effect, but they may be counter-balanced by good works.

T. Suzuki’s and Paul Carus’ translation is entitled Treatise on Response & Retribution.

b. Neo-Confucianism: The Original Way of Confucius for a New Era

i. Zhou Dunyi (1017-1073 C.E.)

Zhou Dunyi’s (1017-1073 C.E.) Diagram of the Supreme Ultimate Explained (Taiji tushuo) is a work of importance to the articulation of the common understanding of the structure of reality that we find in the Neo-Confucian thinkers, including Cheng Hao (1032-1085 C.E.), Cheng Yi (1033-1107 C.E.), and Zhu Xi (1130-1200 C.E.). All of the most important concepts of the Chinese worldview as it was being understood and remade during the 11th to 13th centuries C.E. are present in Zhou Dunyi’s essay: qi, yin and yang, the five phases (wuxing), principle (li 理), and the trigrams and hexagrams of the Yijing. A translation may be found in Bryan W. Van Norden’s and Justin Tiwald’s Readings in Later Chinese Philosophy: Han to the Twentieth Century.

ii. Cheng Hao (1032-1085 C.E.) and Cheng Yi (1033-1107 C.E.)

The Cheng brothers made a powerful impact on the development of Neo-Confucian thought. Cheng Hao was one of the principal figures of the Neo-Confucian movement, and his work connected ontology and morality in a skillful way. His brother Cheng Yi reinterpreted a number of key figures and ideas in Chinese classical philosophy, giving them a distinctive Neo-Confucian flavor. The translations of their work by Philip J. Ivanhoe in Readings in Later Chinese Philosophy: Han to the Twentieth Century are based upon the Chinese texts found in Collected Works of the Two Chengs (Er Cheng ji).

iii. Zhu Xi (1130-1200 C.E.) and the Neo-Confucian Synthesis

Zhu Xi was born in Youxi in Fujian province, China in 1130 C.E. His early interests were in Daoism and Buddhism, but he became the student of Li Tong (1093-1163 C.E.). Li worked within the philosophical tradition of Cheng Hao and Cheng Yi. Zhu Xi compiled an anthology of these thinkers known as Reflections on Things at Hand that became essentially the primer for Neo-Confucianism for generations. If we were to compare him to Western philosophers of the same far-reaching influence, we would take note of Aristotle’s influence in the classical period, Thomas Aquinas in the Medieval period, and Immanuel Kant in the Enlightenment period. He ranks along with Confucius and Mencius as one of the three preeminent thinkers of China. As such, his philosophy represents the most thoroughgoing example of Neo-Confucianism.

One of Zhu Xi’s greatest accomplishments was collecting and compiling the Four Books (sishu), which were made the foundation of the all-important imperial examinations. His systematization of Confucianism into a coherent program of education became the foundation for educational systems in China, Korea, and Japan. Zhu Xi’s oral teachings to students are preserved in Conversations of Master Zhu, Arranged Topically or Categorized Conversations (1270). The translation of excerpts from this text by Bryan W. Van Norden in Readings in Later Chinese Philosophy: Han to the Twentieth Century is based on Zhuzi yulei, vol. 1 (1986 reprint edition) as well as Fung Yulan (Feng Youlan)’s A History of Chinese Philosophy: The Period of Classical Learning, vol. 2.

iv. Wang Yangming (1472-1529 C.E.)

Wang Yangming, a Ming dynasty general and official, practiced Daoist “sitting in forgetfulness (zuowang),” grasped the realization of the unity of knowledge and action that Daoist thinkers know as wu-wei, and taught that the highest form of knowledge was what he called “pure knowledge (liangzhi),” resembling in many ways the epistemology of Chan (Zen) Buddhism. The principal sources for Wang’s ideas are his works, A Record for Practice (1518 C.E., Chuan Xilu) and “Inquiry on the Great Learning” (1527 C.E., Daxue Wen). Excerpts from these texts have been translated by Philip J. Ivanhoe in Readings from the Lu-Wang School of Neo-Confucianism.

Wang actually had a rather stormy career due in large measure to his opposition to the philosophy of Zhu Xi. He departed from Zhu in both his ontology and epistemology. In fact, during the Ming dynasty (1368-1644 C.E.), Wang Yangming became the most deliberative of Zhu Xi’s critics, even if he continued to use much of the philosophical vocabulary of Zhu Xi and other Neo-Confucians.

5. The Chinese and Western Encounter in Philosophy

a. Dai Zhen (1724-1777 C.E.)

Dai Zhen was born in Longfu City (Tunxi city) in Anhui Province into the family of a poor cloth merchant. He devoted himself to the study of the basic works of Chinese philosophy. His two most prominent philosophical works are entitled On the Good (Yuanshan) and An Evidential Commentary on the Meaning of Terms in the Mengzi (Mengzi Ziyi Shu). Dai Zhen’s Evidential Commentary is organized into several parts, each devoted to a particular philosophical term or phrase. His approach is to begin the analysis of each important concept with a philological analysis. Sometimes he shows how the term or phrase was used in selected passages in the history of Chinese philosophy. One of his principal goals is to correct misunderstandings, most particularly those he associates with the Neo-Confucians with respect to their views on reality’s Principle(s) (li) and our human desires. Dai makes the point that, unlike Buddhism’s rejection of all desire as the root of suffering, some desires may actually be positive. Selections from this text, translated by Justin Tiwald, may be found in Readings in Later Chinese Philosophy: Han Dynasty to the 20th Century.

b. Kang Youwei (1858-1927 C.E.)

Kang Youwei was a committed Chinese nationalist in the last years of the Qing dynasty (1644-1912 C.E.). He developed a philosophical construction of a utopian state entitled Book of Great Unity (Da Tong Shu), which should be considered along with other such political visions developed in world philosophy as Plato’s Republic. The work was not published in its entirety until 1935, eight years after Kang’s death. Laurence G. Thompson’s translation of Book of Great Unity is entitled The One-World Philosophy of K’ang Yu-wei.

c. Zhang Dongsun (1886-1973 C.E.)

Zhang Dongsun was well educated in the philosophy and method of the Western philosopher Immanuel Kant. He even interpreted Confucianism along Kantian lines. As an intellectual, he was quite active in government during the early years of the People’s Republic of China but was sent to a re-education camp during the Cultural Revolution (1966-1976 C.E.). He is best known for articulating a “pluralistic epistemology” that emphasizes the importance of sociology, culture, and language in the shaping of worldviews and philosophical approaches. His essay, “A Chinese Philosopher’s Theory of Knowledge,” appears in Our Language and Our World: Selections from Etc.: A Review of General Semantics.

d. Hu Shi (1891-1962 C.E.)

Although influenced by Buddhism in his youth, Hu Shi studied in Shanghai in three schools known for their curriculum called “the New Education,” which was a reference both to a Western style of learning and its content. He later completed his Ph.D. in Philosophy under the direction of John Dewey at Columbia University in the United States. After completing his doctorate in 1917 C.E., he returned to China to become professor of Chinese and Western philosophy at Beijing University. He was instrumental in the development of the New Culture Movement (1912-1920 C.E.) that was dedicated to the modernization of Chinese learning and social progress. He was also a key figure in introducing Pragmatism and scientific research methodologies to China. A succinct representation of the shift to Western science in China during the 20th century is Hu’s “New Credo.”

e. Mao Zedong (1893-1976 C.E.)

Mao Zedong was born in a village in Hunan province into a well-to-do farming family. He was influenced by Sun Yat-sen’s calls for a Republic of China and read widely from Western texts, including Darwin, Mill, and Rousseau. When the May Fourth Movement (May 4, 1919) erupted in Beijing as a response to imperialism, Mao started a magazine in Changsha and called for a union of the popular masses, the liberation of women, and a new Chinese nationalism. When the Communist Party was founded in Shanghai in 1921, Mao started a branch in Changsha. From 1923 to 1925, he worked as a member of the Party Committee alongside the KMT (Kuomintang/Nationalists) and even ran the KMT activities in Hunan. After 1927, Mao became commander of the Red Army or People’s Liberation Army. He later became the first Chairman of the Central Committee of the Chinese Communist Party of the People’s Republic of China and the “Father of the Nation.”

The collected works of Mao from 1917 to 1945 are available in English from the U. S. Government’s Joint Publications Research Service, which includes all articles signed by Chairman Mao individually or jointly, as well as those unsigned but verified as his: http://marxists.org/reference/archive/mao/works/collected-works-pdf/index.htm. For works after 1945, Selected Works of Mao Zedong (1968) is also a good source: http://www.marxists.org/reference/archive/mao/selected-works/index.htm.

While some question Mao’s credentials as a philosopher, he did educate himself extensively in Chinese history and philosophy. Of course, Mao’s concerns are directed toward a relatively narrow range of philosophical inquiry: specifically, social, political, and economic thought.

6. Whither China? Philosophical Views

a. Kang Xiaoguang (b. 1963 C.E.)

Kang Xiaoguang has taken up the challenge to offer a political philosophy for China’s post-Mao years in several works. A good overview of his views in English is David Ownby’s “Kang Xiaoguang: Social Science, Civil Society, and Confucian Religion.” Kang’s principal philosophical claim is that the Chinese Communist Party must be Confucianized. He thinks that what remains of Marxism in the socio-political ideology of the Party should be replaced with a reconstituted and adapted version of the philosophies of Confucius and Mencius. In his program, while the educational system will be kept within the party schools, their syllabi should be changed, listing the Four Books and Five Classics as required courses of study. He calls for a return to the examination system for all promotions within the bureaucracy and argues that Confucian philosophical teachings should be a major component of each examination. Moreover, he maintains that not merely the political system of China, but also the society must be Confucianized. Kang holds that only by introducing Confucianism into the national education system can China regain its value system, as well as possess again a faith and soul for its culture. In his view, this can be achieved only if Confucianism becomes the state’s civil value system.

b. Tu Wei-ming (1940-) and New Confucianism

The New Confucian Movement is a complex and overlapping group of scholars from mainland China to the U. S. A. One thinker who is contributing to this movement is Tu Wei-ming (Du Weiming, 1940-). Having taught and written for many years in the U. S. A., Tu became the founding Dean of the Institute for Advanced Humanistic Studies at Beijing University in 2010. A five-volume anthology of his works was published in Chinese in 2001. One representative example of his work is the essay entitled “Beyond the Enlightenment Mentality: A Confucian Perspective on Ethics, Migration, and Global Stewardship.”

7. References and Further Reading

  • Ames, Roger T. and Henry Rosemont, Jr., trans. The Analects of Confucius: A Philosophical Translation. New York: Ballantine, 1998.
  • Bloom, Irene, trans. Completed by Philip J. Ivanhoe. Mencius. New York: Columbia University Press, 2009.
  • Brooks, E. Bruce and A. Taeko, trans. The Original Analects: Sayings of Confucius and His Successors. New York: Columbia University Press, 1998.
  • Chan, Wing-tsit, trans. A Sourcebook in Chinese Philosophy, 4th ed. Princeton: Princeton University Press, 1963.
  • Csikszentmihalyi, Mark, trans. “The Way of the King Joins the Three” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, trans and ed., Justin Tiwald and Bryan W. Van Norden (Indianapolis: Hackett, 2014), 15-18.
  • Forke, Alfred, trans. Philosophical Essays of Wang Ch’ung. London: Luzac, 1907. Available online at http://www.humanistictexts.org/wangchung.htm.
  • Fung, Yu-lan, trans. and ed. A History of Chinese Philosophy, 2 vols. Princeton: Princeton University Press, 1953.
  • Hu Shi. “My Credo and Its Evolution,” in Living Philosophies: A Series of Intimate Credos, Henry Goddard Leach, ed. (New York: Simon and Schuster, 1931), 235-63.
  • Hutton, Eric, trans. “Xunzi” in Readings in Classical Chinese Philosophy, eds. P.J. Ivanhoe and Bryan Van Norden (Indianapolis: Hackett Publishing, 2001), 255-311.
  • Hutton, Eric, trans and ed. Xunzi: The Complete Text. Princeton: Princeton University Press, 2014.
  • Ivanhoe, Philip J., trans. “Cheng Hao, Selected Sayings” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, Justin Tiwald and Bryan Van Norden, eds. (Indianapolis: Hackett, 2014), 143-152.
  • Ivanhoe, Philip J., trans. “Cheng Yi, Selected Sayings” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, Justin Tiwald and Bryan Van Norden, eds. (Indianapolis: Hackett, 2014), 158-168.
  • Ivanhoe, Philip J., trans. The Daodejing of Laozi. New York: Seven Bridges Press, 2002.
  • Ivanhoe, Philip J., trans. “The Platform Sutra of the Sixth Patriarch” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, Justin Tiwald and Bryan Van Norden, eds. (Indianapolis: Hackett, 2014), 91-98.
  • Ivanhoe, Philip J., trans. “A Record of Practice” in Readings from the Lu-Wang School of Neo-Confucianism, Philip J. Ivanhoe, trans. and ed. (Indianapolis: Hackett Publishing, 2009), 131-160.
  • Ivanhoe, Philip J. “Whose Confucius? Which Analects?” in Confucius and the Analects: New Essays, ed. Bryan W. Van Norden (Oxford: Oxford University Press, 2002), 119-133.
  • Johnston, Ian, trans. The Mozi: A Complete Translation. New York: Columbia University Press, 2010.
  • Kang, Xiaoguang. “Confucianization: A Future in the Tradition,” trans. Huiqing Liu, Social Research 73:1 (2006): 77-120.
  • LaFargue, Michael, trans. The Tao of the Tao-te-ching. Albany: State University of New York Press, 1992.
  • Lau, D. C. trans. Mencius. 2 vols. Hong Kong: Chinese University Press, 1984.
  • Lau, D. C. “On Mencius’ Use of the Method of Analogy in Argument” in Lau, trans., Mencius (London: Penguin Books, 1970), 235-263.
  • Liao, W.K., trans. Complete Works of Hanfeizi. London: Arthur Probsthain, 1939. http://www2.iath.virginia.edu/saxon/servlet/SaxonServlet?source=xwomen/texts/hanfei.xml&style=xwomen/xsl/dynaxml.xsl&chunk.id=d1.1&toc.depth=1&toc.id=0&doc.lang=bilingual.
  • Littlejohn, Ronnie. Confucianism: An Introduction. London: I.B. Tauris, 2011.
  • Littlejohn, Ronnie. Daoism: An Introduction. London: I.B. Tauris, 2010.
  • Liu, Xiaogan. Classifying the Zhuangzi Chapters. Trans. by Donald Munro. Ann Arbor, Michigan: The University of Michigan, 1994.
  • Major, John, Sarah Queen, Andrew Set Meyer, and Harold Roth, trans. The Huainanzi: A Guide to the Theory and Practice of Government in Early Han China. New York: Columbia University Press, 2010.
  • Mao, Zedong (1917-45). Collected Works of Mao Zedong. US Government’s Joint Publications Research Service. http://marxists.org/reference/archive/mao/works/collected-works-pdf/index.htm.
  • Mao, Zedong. Quotations from Mao Tse Tung. Beijing: Peking Foreign Languages Press, 1966.
  • Mao Tse Tung Internet Archive, http://www.marxists.org/reference/archive/mao/works/red-book/index.htm.
  • Mao, Zedong.《毛泽东選集》Mao Zedong Xuanji, Selected Works of Mao Zedong. Beijing: Renmin Press, 1968. http://www.marxists.org/reference/archive/mao/selected-works/index.htm.
  • Ownby, David. “Kang Xiaoguang: Social Science, Civil Society, and Confucian Religion.” China Perspectives 4 (2009): 101-111.
  • Roth, Harold. “Who Compiled the Chuang-tzu?” in Chinese Texts and Philosophical Contexts, ed. Henry Rosemont. La Salle: Open Court, 1991.
  • Sanderovitch, Sharon, trans. “The Way of the King Joins the Three” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, trans and ed., Justin Tiwald and Bryan W. Van Norden (Indianapolis: Hackett, 2014), 13-15.
  • Shaughnessy, Edward, trans. The I Ching: The Classic of Changes. New York: Ballantine Books, 1997.
  • Slingerland, Edward, trans. Confucius: Analects, with Selections from Traditional Commentaries. Indianapolis: Hackett Publishing, 2003.
  • Suzuki, D.T. and Carus, Paul, trans. Treatise on Response & Retribution, Chicago: Open Court, 1906.
  • Thompson, Laurence, trans. Ta t´ung shu: the One-world Philosophy of K`ang Yu-wei. London: George Allen and Unwin, 1958.
  • Tiwald, Justin, trans. “An Evidential Commentary on the Meaning of Terms in the Mengzi” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, trans. and ed., Justin Tiwald and Bryan W. Van Norden (Indianapolis: Hackett, 2014), 318-337.
  • Tu, Weiming, “Beyond the Enlightenment Mentality: A Confucian Perspective on Ethics, Migration, and Global Stewardship.” International Migration Review 30.1 (Spring 1996), 58-75.
  • Van Norden, Bryan W. ed. Confucius and the Analects: New Essays. Oxford: Oxford University Press, 2002.
  • Van Norden, Bryan W, trans. Mengzi, with Selections from Traditional Commentaries. Indianapolis: Hackett Publishing, 2008.
  • Van Norden, Bryan W., trans. “Categorized Conversation of Zhu Xi” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, Justin Tiwald and Bryan Van Norden, eds. (Indianapolis: Hackett, 2014), 168-184.
  • Van Norden, Bryan W. and Tiwald, Justin, trans. “Explanation of the Diagram of the Great Ultimate” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, trans and ed., Justin Tiwald and Bryan W. Van Norden (Indianapolis: Hackett, 2014), 136-140.
  • Watson, Burton, trans. The Complete Works of Chuang Tzu. New York: Columbia University Press, 1968.
  • Wei, Tat, trans. Ch’eng Wei-shi lun (Doctrine of Mere-Consciousness by Hsuan Tsang). Hong Kong: The Ch’eng Wei-shi lun Publication Committee, 1973.
  • Zhang Dongsun, “A Chinese Philosopher’s Theory of Knowledge” in Our Language and Our World: Selections from Etc.: A Review of General Semantics, ed., S.I. Hayakawa and trans., Li An-che (New York: Harper, 1959), 299-324.

 

Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Plato: The Laws

The Laws is Plato’s last, longest, and, perhaps, most loathed work. The book is a conversation on political philosophy between three elderly men: an unnamed Athenian, a Spartan named Megillus, and a Cretan named Clinias. These men work to create a constitution for Magnesia, a new Cretan colony. The government of Magnesia is a mixture of democratic and authoritarian principles that aim at making all of its citizens happy and virtuous.

Like Plato’s other works on political theory, such as the Statesman and the Republic, the Laws is not simply about political thought, but involves extensive discussions on psychology, ethics, theology, epistemology, and metaphysics. However, unlike these other works, the Laws combines political philosophy with applied legislation, going into great detail concerning what laws and procedures should be in Magnesia. Examples include conversations on whether drunkenness should be allowed in the city, how citizens should hunt, and how to punish suicide. Yet, the legal details, clunky prose, and lack of organization have drawn condemnation from both ancient and modern scholars. Many have attributed this awkward writing to Plato’s old age at the time of writing; nonetheless, readers should bear in mind that the work was never completed. Although these criticisms have some merit, the ideas discussed in the Laws are well worth our consideration, and the dialogue has a literary quality of its own.

In the 21st century, there has been a growing interest among philosophers in the study of the Laws. Many of the philosophical ideas in the Laws have stood the test of time, such as the principle that absolute power corrupts absolutely and that no person is exempt from the rule of law. Other significant developments in the Laws include the emphasis on a mixed regime, a varied penal system, its policy on women in the military, and its attempt at rational theology. Yet, Plato took his most original idea to be that law should combine persuasion with compulsion. In order to persuade citizens to follow the legal code, every law has a prelude that offers reasons why it is in one’s interest to obey. The compulsion comes in the form of a punishment attached to the law if the persuasion should fail to motivate compliance.

In addition, in the Laws Plato defends several positions that appear in tension with ideas expressed in his other works. Perhaps the largest difference is that the ideal city in the Laws is far more democratic than the ideal city in the Republic. Other notable differences include appearing to accept the possibility of weakness of will (akrasia)—a position rejected in earlier works—and granting much more authority to religion than any reader of the Euthyphro would expect. By exploring these apparent differences, students of Plato and the history of philosophy will come away with a more nuanced and complex understanding of Plato’s philosophical ideas.

Table of Contents

  1. Setting and Characters
  2. The Laws, Customs, and Political Structure of Magnesia
  3. The Relationship between the Laws and the Republic
  4. Overview of the Laws
  5. Book 1 and 2
    1. Virtue
    2. Education and Moral Psychology
    3. Happiness and Virtue
    4. Symposium
  6. Book 3
    1. The Origin of Legislation
    2. Sparta
    3. Persia and Athens
  7. Book 4
    1. Geography of Magnesia
    2. Colonists and Legislation
    3. Preludes
  8. Book 5
    1. Ethics
    2. Geography and Population
  9. Book 6
    1. Voting and Offices
    2. Marriage
  10. Book 7 and 8
    1. Musical Education
    2. Gymnastics
  11. Book 9
    1. Responsibility
    2. Punishment
  12. Book 10
    1. Atheism
    2. Deism and Traditional Theism
  13. Book 11 and 12
    1. Laws
    2. Nocturnal Council
  14. References and Further Reading
    1. Standard Greek Texts
    2. English Translations
    3. General Discussions and Anthologies
    4. Culture, Laws, and Context
    5. The Preludes
    6. Ethics, Moral Psychology, and Political Thought
    7. Theology

1. Setting and Characters

The dialogue is set on the Greek island of Crete in the 4th century B.C.E. Three elderly men are walking from Cnossos to the sacred cave and sanctuary of Zeus located on Mount Ida. This setting is crucially linked to the theme of the Laws. These three men are walking the path that Minos (a legendary lawgiver of Crete) and his father followed every nine years to receive the guidance of Zeus. As these men trace Minos’ steps, they seek to discover what the best political system and laws are. Like Minos, they too will found their political system on their understanding of the gods.

Each man is from a different Greek city-state (polis). Clinias is from Cnossos, Crete; Megillus is from Sparta; and the unnamed individual is from Athens. There is some speculation as to who this unnamed Athenian might be. Aristotle (Politics 2.6.1265a) thinks he is Socrates. Cicero (Laws 1.5.15) holds that he is Plato himself, while others speculate that he is supposed to remind the reader of the Athenian politician Solon. Another interpretation holds that the Athenian is unnamed because Plato doesn’t intend for him to represent any particular historical figure.

Setting aside the issue of who the Athenian is, readers might wonder whether they should interpret his views as Plato’s own. There is no easy and uncontroversial answer to the question. Indeed, it is a problem that pervades all of Plato’s work. Scholars adopt a variety of approaches towards this issue. Some scholars take the protagonist to represent Plato’s own view, while others hold that Plato’s view isn’t identified with any single character, but is found indirectly in the overall discussion. Furthermore, some interpreters maintain that Plato intentionally leaves his direct voice out of the dialogues because he isn’t interested in putting forth specific theses, but rather is interested in generating thought about a set of related questions.

Although Spartans, Cretans, and Athenians are unified in the sense that they are all Greek, they differ culturally. Spartans and Cretans are from the Dorian ethnic group, while Athenians are Ionian. This is relevant for two reasons. First, the Ionians and Dorians have not always been on friendly terms. Indeed, this conflict culminated in the Peloponnesian War (431-404 B.C.E.). Second, Dorians are stereotyped as having an exclusive military focus and a distaste for intellectual pursuits, while Athenians are seen as being more artistic and philosophical. Both of these features will play out in the drama of the dialogue, as each interlocutor will defend views characteristic of their home institutions and will behave in ways that are stereotypical of their culture.

2. The Laws, Customs, and Political Structure of Magnesia

Magnesia, the theoretical colony of Crete that is developed in the Laws, is a self-sufficient agricultural state located nine to ten miles from the sea. Its remote location will deter the influence of visitors, who might corrupt the culture of Magnesia. That being said, Magnesia will have a population of slaves and foreigners who carry out necessary tasks forbidden to citizens, such as trading and menial labor. The city will consist of 5,040 households. The Athenian is adamant about this number because it is divisible by any number from 1 to 12 (with the exception of 11), making it convenient for purposes of administration. Each household will be allotted two plots of land (one near the city center and one located further away), and these plots of land are inalienable to the holder’s family. The intention is to prevent members of the community from becoming wealthy at the expense of other citizens. Indeed, the city is designed in such a way as to prevent citizens from becoming extremely wealthy or poor. Nevertheless, there will be four property classes based on the wealth one’s family accumulated before coming to Magnesia. Although the land will not be farmed in common, it is to be considered a part of the common property, and shareholders must make public contributions. Women will not be allowed to own property, but will be considered citizens and can hold political office. In fact, women are able to participate in the military as soldiers and can attend their own private common meals, two practices usually reserved for men in ancient Greece.
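
The divisibility claim about 5,040 is easy to check directly. The short Python sketch below is an illustration only; the helper name divisors_up_to is hypothetical and is not drawn from the Laws or from any library. It lists which of the numbers 1 through 12 divide 5,040 evenly.

```python
# Hypothetical helper (illustration only): return the numbers in 1..limit
# that divide n with no remainder.
def divisors_up_to(n, limit):
    return [d for d in range(1, limit + 1) if n % d == 0]

households = 5040
divisors = divisors_up_to(households, 12)

# Numbers from 1 to 12 that do NOT divide the household count evenly.
missing = [d for d in range(1, 13) if d not in divisors]

print(divisors)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
print(missing)   # [11]
```

The check agrees with the text: 11 is the only number in that range that fails to divide 5,040, so the households can be split into 2, 3, 4, and so on up to 10, or into 12, equal groups.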

The political system of Magnesia will be mixed, blending democratic and authoritarian elements. This can be seen in how political offices are handled. There are a vast number of different political offices in Magnesia, some of which will be filled from the general citizen body. The benefit of this is that it will make the citizens feel that they have a stake in Magnesia. However, at the same time, there will be particular offices made up of more elite citizens. For example, the “guardians of the law” will supervise the general citizen body. In order to ensure that the guardians of the law are accountable for their conduct, there will be a powerful board of “scrutineers” that provides a check on their authority. The most distinguished office is the “nocturnal council,” which will be in charge of researching the philosophical nature of law and offering insight into how its findings can be applied in Magnesia.

3. The Relationship between the Laws and the Republic

Although the Republic and the Laws share many similarities, those who come to the Laws after reading the Republic will likely be surprised at what they find, since these texts differ with respect to both content and style. In terms of style, the Laws has far less literary quality than Plato’s masterpiece, the Republic. This is partly a result of the fact that the Laws deals with the details of legal and governmental policies, while the Republic doesn’t; rather, the Republic focuses on politics and ethics at a much more general level. Furthermore, unlike in Plato’s other works, the character Socrates is noticeably absent in the Laws.

Turning now to content, in the Republic, Socrates develops an ideal city, referred to as the Callipolis (literally, the beautiful or noble city). The Callipolis consists of three classes: a large working class of farmers and craftspeople, an educated military class, and a small number of elite philosophers who will rule the city. The military and ruler classes are called “guardians,” and they will not have any private property. Indeed, they will hold everything in common including women, men, and children. Unlike in the Callipolis, private property is allowed throughout Magnesia and political power is spread throughout the city. Another notable difference is that only philosophers possess fully-developed virtue in the Republic (and in the Phaedo) while in the Laws the Athenian says that correct legislation aims at developing virtue in the entire citizen body (1.630d-631d, 4.705d-706a, 4.707d, 6.770c, 12.962b-963a). To be sure, the political structure of the Callipolis secures the correct behavior of all citizens. However, because complete virtue involves knowledge, which only philosophers have, non-philosophers can only approximate virtue. In other words, the Laws seems to express more optimism than the Republic with respect to the average citizen’s ability to be virtuous.

This leaves readers to wonder what could explain these apparent differences. Although many different answers have been presented, the most prevalent answer is that the texts were written for two different purposes. The Republic represents Plato’s ideal vision of a political utopia, while the Laws represents his vision of the best attainable city given the defects of human nature. Aristotle, for example, holds that the Republic and the Laws share many of the same features, but that the Laws offers a system that is more capable of being generally adopted (Politics 2.6.1265a-b). Many scholars have supported this reading by pointing out that Magnesia is said to be the second best city, with the ideal city being one in which women, children and property are held in common (Laws 5.739a-740a). Additionally, this interpretation explains why the Laws goes into greater detail concerning day-to-day activities than the Republic does. Because the Callipolis is an unattainable utopia, there is no point in discussing its customs in any sort of detail, but because Magnesia is attainable, this is a worthwhile project. Trevor Saunders captures the essence of this interpretation when he says, “The Republic presents merely the theoretical ideal…The Laws describes, in effect, the Republic modified and realized in the conditions of this world” (1970, 28).

An alternative answer is that Plato changed his mind. On this reading, the views defended in the Laws are an advancement on the ideas expressed in the Republic. This reading denies that 5.739a-740a provides support for the claim that the Callipolis is the ideal city. Strictly speaking, the passage only says that the ideal city is one where everything is held in common, and in the Callipolis only the guardians hold things in common. This lends credence to thinking that the ideal city described in the Laws is not the Callipolis. Christopher Bobonich (2002) has argued that this new perspective is the result of Plato changing his mind about psychology, abandoning the view of the Republic in which the soul has parts and replacing it with a more unified conception of human agency and motivation. However, readers should note that this is merely a cursory discussion of a very large and important issue—there are many other ways to account for the differences between the texts.

4. Overview of the Laws

The Laws is made up of twelve books. Books 1 and 2 explore what the purpose of government is. This exploration takes the form of a comparative evaluation of the practices found in the interlocutors’ homelands. Through the course of this discussion, a preliminary account of education and virtue is offered. Book 3 examines the origins of government and the merits of different constitutions. At Book 3’s conclusion, it is revealed that Clinias is in charge of developing a legal code for a new colony of Crete, Magnesia. After discussing the appropriate population and geography of Magnesia, Book 4 analyzes the correct method of legislating. Book 5 begins with various moral lessons and then shifts to an account of the correct procedure for founding Magnesia and distributing the land within it. Book 6 presents the details of the various offices and legal positions in Magnesia and ends by examining marriage. Books 7 and 8 discuss the musical and physical education of the citizens. Book 8 concludes with a discussion of sexuality and economics. Book 9 introduces criminal law and analyzes what factors should be taken into account when determining a punishment. Book 10 examines laws concerning impiety and presents an account of theology. Books 11 and 12 continue with the legal code. The Laws ends with an account of the “Nocturnal Council,” the “anchor” of the city.

5. Book 1 and 2

a. Virtue

The dialogue begins with the Athenian inquiring into the origin of law, asking whether it comes from a god or a human being. Clinias states that Zeus is credited as the originator of Crete’s laws, while Apollo is credited as the founder of Sparta’s (624a-625a). The conversation shifts to the question of the purpose of government. Megillus and Clinias hold that the goal of government is to win in war, since conflict is an essential condition of all human beings (625c-627c). Because the fundamental goal is victory in war, Clinias and Megillus maintain that the primary purpose of education is to make citizens courageous. The Athenian responds by pointing out that reconciliation and harmony among warring parties is superior to one group defeating another. This demonstrates that peace is superior to victory (627c-630d). Consequently, the educative system should not focus exclusively on cultivating courage in its citizens, but should develop virtue in its entirety, including not only courage but wisdom, moderation and justice as well (630d-631d). Indeed, courage, the Athenian argues, is the least important virtue (631d). The goal of law is to help its citizens flourish, and the most direct route to this is developing virtue in them.

It is during this discussion that the Athenian makes an important distinction between “divine” and “human” goods. Divine goods are the virtues, whereas human goods are things like health, strength, wealth, and beauty. Divine goods are superior to human goods in that human goods depend on divine goods, but divine goods do not depend on anything. The idea is that the virtues always contribute to human flourishing, but things that are commonly thought to do so, such as wealth and beauty, will not do so unless one possesses virtue. In fact, things like beauty and wealth in the hands of a corrupt person will enable him or her to act in ways that will lead to failure.

Now that the importance of virtue is established, the Athenian challenges his interlocutors to identify the laws and customs of their home cities that develop virtue. Megillus easily identifies the Spartan practices that cultivate courage. The Spartans’ educational method primarily focuses on exposing citizens to fear and pain so that they might develop a resistance to each (633b-c). The Athenian responds by pointing out that this practice does nothing to develop resistance to desire and pleasure. He argues that the Spartans only have partial courage because complete courage involves not only overcoming fear and pain, but desire and pleasure as well (633c-d).

This leads to an inquiry into what customs Sparta and Crete have for developing moderation. Megillus expresses uncertainty, but suggests it likely has to do with gymnastics and common meals (essentially an all-male club with a military emphasis). The conversation becomes contentious as the Athenian says that these practices are the cause of the Dorians’ reputation for pederasty, homosexuality, and the vicious pursuit of pleasure (636a-e). (To see Plato express an alternative attitude towards these practices, readers should turn to the Phaedrus and Symposium.) Megillus defends the nobility of the Spartans, proclaiming that they do not get drunk and that they would beat any drunkard they encountered even if it were during the festival of Dionysus (636e-637a). The Athenian thinks this is a bad practice, because under the appropriate conditions intoxication can help one cultivate moderation and courage.

In having the characters put forth the particular positions that they do, Plato is asking us to reflect on the way in which political institutions shape citizens’ values. For instance, Clinias and Megillus, who both come from cultures that center on the military, hold that conflict is a fundamental part of human nature and that courage is the greatest virtue. In contrast, the Athenian, who comes from a culture of art and philosophy, sees harmony, peace, and leisure as ideal. Hence, in order for citizens to cultivate the appropriate dispositions, it is essential that the city have the correct policies and that citizens receive the correct education.

b. Education and Moral Psychology

In defense of moderated intoxication, the Athenian offers an account of education and moral psychology. By education, the Athenian does not mean technical skills, but rather things that direct one towards virtue. The bulk of education is meant to instill the appropriate feelings in citizens so that they feel pleasure and pain with respect to the appropriate things. Just as the Spartan practice of exposing citizens to fear and pain can help cultivate the appropriate feelings with respect to pain, drinking parties can help citizens develop the appropriate feelings with respect to pleasure. The idea is that one can learn to resist negative pleasures and desires only by being exposed to them. Supervised drinking parties provide a safe and inexpensive way to do this.

Megillus and Clinias are quite skeptical and ask the Athenian to explain how wine affects the soul. It is here that we get an account of moral psychology (644c-645c). The Athenian asks us to imagine a puppet made by the gods with various cords in it. These cords, which represent affections (pleasure, pain, and the emotions) in the soul, pull the puppet in various directions. One cord is sacred and golden. This cord represents reason or calculation and when one follows it, one is virtuous. However, because reason/calculation is soft and gentle it requires the assistance of the other cords (which are hard and violent) to move the puppet in the correct way. The general idea is that virtue not only requires reason/calculation, but also the cultivation of the correct feelings.

The puppet metaphor raises a number of philosophical issues surrounding strength of will (enkrateia) and weakness of will (akrasia). Roughly put, weakness of will is when one intellectually grasps that one should do a certain action, but one’s emotions and desires overrule this judgement, leading to ethical failure. Strength of will is the contrary phenomenon. Like the weak-willed person, the strong-willed person desires to do other than what they intellectually judge they should do. Unlike the weak-willed person, the strong-willed person overcomes these desires and behaves correctly. In the Protagoras (352a-c), Socrates denies the possibility of weakness of will, and in the Republic the virtuous agent is not the strong-willed individual who overcomes contrary emotions, but one whose psychic forces exist in perfect harmony. On the face of it, the puppet metaphor raises trouble for both of these commitments. It presents a problem for the former because it suggests that the pull of reason/calculation can be overcome by the emotions (the hard and violent cords) (see also 3.689c and 5.734b). However, this interpretation faces the problem that the cord called reason/calculation in the metaphor is itself described as an emotion/force, which raises doubts that Plato’s intent is to draw a contrast between reason and the emotions.

The puppet metaphor also raises problems for the view that virtue is harmony because virtue in the puppet metaphor involves mastering the pull of contrary cords. This suggests that virtue amounts to being strong-willed. However, in Book 2 the Athenian describes virtue as the agreement between pleasure and pain and the account that one grasps, or reason (653a). This description is in line with thinking that virtue is a harmony in the soul between the different psychic forces.

Another issue disputed by scholars is whether the soul in the puppet metaphor consists of three parts as it does in the Republic. In the Republic (see also Phaedrus 246a-254e), the three parts of the soul are the reasoning/calculating part, the spirited part, and the appetitive part. Some scholars defend a continuity between the Laws and the Republic, while others argue that the metaphor suggests a bipartition between the rational and non-rational. In other words, in the Laws, the non-rational part of the soul subsumes both the appetitive and the spirited part. Additionally, other scholars have argued that in the Laws, Plato no longer treats the soul as having parts, but more as a unitary agent with different forces in it.

c. Happiness and Virtue

Book 2 continues the discussion surrounding drinking parties and education. Musical education forms the foundation of one’s character because it is through song and dance that one cultivates the appropriate affective responses (654a-d). By taking pleasure in virtuous actions depicted in song and dance, one begins to cultivate virtue (655d-655b). The contrary is true too: one will cultivate vice if one takes pleasure in vicious actions depicted in song and dance (655b-656b). Because of this, it is paramount for the legislator to establish what music should be allowed in the city, a task that the Athenian believes is best handled by the elderly given their wisdom (658a-e).

One of the most important things music should teach is that justice produces happiness, while injustice produces unhappiness (660b-664b). Clinias and Megillus are skeptical about the connection between virtue and happiness. Clinias will concede that an unjust person lives shamefully, but does not think they live an unsuccessful life if they have wealth, strength, health, and beauty (661d-662a; compare Gorgias 474c-475e). The Athenian responds by offering four arguments for why it is necessary that the legislators teach that happiness is linked to justice. The first argument is that a legislator who does not teach this to the citizens is sending contradictory messages (662c-663a). On the one hand, the legislators are telling citizens that they should be just so that they may live a good life, but, on the other hand, they are teaching them that they will be deprived of a benefit, namely pleasure, by living justly. The second argument is that a legislator who does not teach this will find it impossible to persuade the citizens to be just (663b-c). The third argument is that the statement is true: justice is linked to happiness (663c-d). The fourth argument is that even if the doctrine were not true, it ought to be taught anyway because of the social benefits that it provides (663d-e).

d. Symposium

Having secured the importance of teaching the connection between justice and happiness, the Athenian continues his discussion of the symposium. He explains that drinking parties and drunkenness should be reserved for citizens in mid-to-late adulthood and must be supervised by a wise leader. The young have lots of energy and are already eager to participate in musical education. Thus, participating in drinking parties would overstimulate the youth and would lead to negative consequences. However, as one ages, one grows despondent and less interested in song and dance. Thus, drinking parties will return older adults to a youthful state in which they are more eager to participate in musical education (671a-674c).

6. Book 3

Book 3 surveys the successes and failures of different political constitutions throughout history. Readers should bear in mind that the historical accounts given by Plato are not entirely accurate, but are rather being used to illustrate certain philosophical points.

a. The Origin of Legislation

The Athenian begins by talking about the traditional idea that developed culture is repeatedly annihilated by a great flood. From this flood emerged a primitive culture. During this time life was simple and peaceful. Because there were so few people, individuals were delighted to see each other and resources were abundant (678e-679a). Despite not having any formal law, people lived according to a political system called autocracy or dynasty (680b). In this system the eldest ruled, with authority being passed down through one’s parents.

Eventually, small clans merged together and formed cities. Once this happened, conflict arose because there were different elders, each claiming to have authority. In addition, each clan brought with it different religious customs. From this conflict, legislation arose (681c). Individuals were selected to represent the interests of the various clans that made up the city. These representatives spoke to the respective leaders of the clans about what rules should be adopted (681c-d).

From these digressions into the origin of legislation three lessons can be drawn. First, cities and civilization are a natural development. The Athenian is rejecting the idea that the city and law are unnatural (see 10.888e-890a; Protagoras 320d-322d; Republic 358b-359b). Second, humans are not naturally opposed to one another as Clinias suggested in Book 1, but share mutual goodwill. Third, a necessary feature of legislation is the reconciliation of conflicts of interest (see Stalley 1983, 71-2).

b. Sparta

After discussing the rise and fall of Troy, the Athenian turns to the history of the three allied Dorian states of the Peloponnese: Sparta, Argos, and Messene. The leaders and citizens of each state bound one another by oaths to respect each other’s rights and to come to each other’s aid if they should be threatened. However, the alliance dissolved, with only Sparta surviving the fallout with any kind of success. Why did the alliance fail? The Athenian asserts that it was the result of a type of ignorance, namely the discordance between one’s emotions and one’s judgments (689a-c). From this, it is agreed that no citizen who suffers from this ignorance should have any degree of power (689c-e). This returns us to the discussion of education in Books 1 and 2, where we are told that in order for a city to flourish its citizens must cultivate the appropriate affective responses.

Argos’ and Messene’s respective leaders suffered from this type of ignorance and the negative consequences of this were exacerbated by the fact that they had absolute power (690d-691d). Sparta, in contrast, was safeguarded from disaster because it distributed political power between multiple actors (or positions of power), including two kings (rather than one), a council of elders, and officials chosen by lot (called ephors) (691d-692bc). Here, the Athenian is introducing the key political idea that a successful constitution will distribute power by mixing various ruling elements.

c. Persia and Athens

Having described a moderate political system in Sparta, the Athenian discusses two states that stand as opposites to each other: Athens and Persia. Athens represents the extreme democracy and Persia the extreme monarchy. According to the Athenian, Persia fluctuated between periods of success and failure. Under the rule of Cyrus, there was a balance of freedom and subjection. Soldiers were granted freedom of speech and the king took counsel from wise citizens. The result was that the soldiers had positive feelings towards their leaders and the state was guided in a wise direction (694b-c). However, upon the death of Cyrus, disaster ensued. Cyrus’ sons were raised in luxury and were never properly educated (694c-695b). Instead of blending freedom and subjection as their father did, his sons were violent and demanded submission (695b). Eventually, Darius took control of the empire and this process repeated itself. Darius salvaged the empire by embracing freedom and subjection, but when his pampered son, Xerxes, took over, the empire suffered (695d-e).

According to the Athenian, the history of Athens is very much the opposite of that of Persia. If Persia failed because its rulers did not grant enough freedom, Athens failed because it granted too much. When the Persians attacked the Greeks, out of fear and necessity the Athenians lived according to certain honor codes that bound the community together. During this time, Athenians would voluntarily submit themselves to authority and because of this Athens was successful in its defense (698b-700a). However, once the threat from Persia was gone, the fear and honor codes that held the community together and naturally restricted freedom left as well. Athenians began to consider themselves the authority on various matters and let pleasure guide them. This resulted in a community of ignorance and excess (700a-701d).

The Athenian’s point is two-fold. First, if a political system is to succeed it must be a mixture of subjection and freedom. It must grant enough freedom such that citizens are not oppressed and do not resent the leaders, but follow them willingly. Indeed, the political system should be concerned about the welfare of the entire citizen body. Nevertheless, a political system must grant authority only to those who are wise since the masses will simply pursue what they find most pleasant. Hence, there must be some restrictions on the freedom of citizens. Second, the only way to consistently achieve a balanced political system is if the citizens receive a proper education.

7. Book 4

a. Geography of Magnesia

At the end of Book 3, Clinias reveals that he is one of ten Cretans assigned to compose a legal code for a new colony, Magnesia. Book 4 begins the construction of this new colony. Magnesia will be located at an isolated site on Crete, roughly nine or ten miles inland. Although the terrain is rough, the land has many resources. The Athenian is pleased to find this out because it means that Magnesians will not require a significant amount of trade with other communities. This is beneficial because it will restrict foreign influence on the city (704a-705b).

b. Colonists and Legislation

Colonists will mostly come from Crete, though individuals from the greater Peloponnese will be welcome as well. Initially, this poses a problem. Magnesia will consist of individuals with different cultural customs, so how can these be reconciled under a single system of law? The Athenian’s solution at this stage of the argument is that a moderate dictator and a wise legislator should develop the legal code and constitution (709a-710e). The advantage of a dictatorship is that the laws and customs can easily be altered since power is located in one individual. It should be noted that after the dictator and legislator create the legal code, power will be transferred to various officials.

The next project is to describe what constitution this benevolent dictator will create. No straight answer is given; instead, the Athenian proceeds to offer a myth of life during the time of Cronos (Zeus’ father). The myth explains that during Cronos’ rule, life was blessed and happy. Cronos, knowing that human nature is corrupt, put divine beings in charge of humans. This is similar to how humans rule over farm animals. The lesson is that one should not be ruled by one’s equal, but by one’s superior. The Athenian explains that although Cronos’ reign is over and divine beings no longer guide us, within human beings is a divine element, namely, reason. By following reason, the laws will mirror the divine rule that occurred during the time of Cronos and humans will be happy (713c-714a). This myth connects the reader back to the initial topic of the Laws, which concerns the connection between law and the divine. The Athenian is explicitly linking together reason, law, and the divine.

From the myth of Cronos, it is clear that the law should be rational, but whom should it serve and where does its authority lie? The Athenian maintains that any law that does not serve the interest of the whole city is a bogus law (715b). For this reason, those who hold political positions will be called servants of the law rather than rulers. Since the law is connected to the divine, those who serve the interests of the city are really serving the gods (715c-d). From this it is clear that the law is to have authority over all citizens and that the law is fundamentally concerned with the welfare of the whole community and not any particular group or individual.

c. Preludes

The initial framing of the laws comes directly from the legislator and the dictator. The Athenian remarks that this is the best and most efficient means of establishing good laws in the city. But if law comes entirely from the outside, why would a citizen follow it willingly? How is the Athenian not simply making the same mistake he accused the Persian leaders of making? The Athenian solves this problem by inventing the idea of a prelude in law.

He begins his explanation with a medical analogy in which he compares the medical practices of a free doctor with that of a slave doctor (720a-720e). The doctors differ in terms of whom they treat and how they treat them. The slave doctor primarily treats slaves and acts like a tyrant—simply issuing commands and forcing his patients into obedience. In contrast, the free doctor primarily treats free people and is attentive to his patients before he issues prescriptions. In fact, the free doctor will offer no prescription until he has persuaded his patient about what is the correct medical procedure. The slave doctor is like a tyrant, relying solely on compulsion; in contrast, the free doctor utilizes both persuasion and compulsion. The Athenian wants the legislator to be like the free doctor, using both persuasion and compulsion.

Persuasion is achieved by attaching preludes to the law. In musical compositions, preludes are brief musical performances that precede the main composition. Musical preludes are designed to complement the forthcoming performance so that it is better received by the audience. Similarly, the legislator can preface the law with brief statements that will make the citizens more cooperative and ready to learn, and thus more likely to accept the laws freely (722d-723a). Compulsion is achieved by attaching penalties to the law if citizens should choose not to comply.

The Athenian clearly wants citizens to obey the law voluntarily. He realizes that in order for this to happen the citizens must see the law as serving their interests and the preludes are meant to accomplish this. But what is the nature of the persuasion underlying the preludes? There are three main interpretations. The first interpretation is that the persuasion is rational. Defenders of this view maintain that the point of the preludes is to explain to citizens the actual reasons that underlie the law. The evidence in favor of this reading is mainly found in how the Athenian describes the preludes. When discussing the preludes, the Athenian repeatedly says that they involve teaching, learning, and reason (4.718c-d, 4.720d, 4.723a, 9.857d-e, 9.858d, and 10.888a). If this interpretation is correct, then the Laws presents a much more optimistic view of the average citizen than the Republic does. In the Republic, farmers and artisans do not receive philosophical training, but on this reading the citizens of Magnesia will come to grasp some of the underlying philosophical reasons behind the law.

The second interpretation holds that the persuasion is non-rational and does not appeal to citizens’ reason, but rather to their emotions. The main evidence in support of this reading is found in the preludes themselves. Many (though not all) of the preludes are like conventional sermons, merely shaming the citizens into obedience. A favorite example of those who support the non-rational reading is the prelude to the hunting laws. In this prelude, the Athenian simply asserts that only hunting land animals with horses, dogs, or on foot is worthy of courage, and that other forms of hunting, such as trapping, are lazy and should not be done (7.823d-824b; see also 5.726a-734e, 6.772e-773c, 9.854b-c, 10.904e-905c, and 11.927a-d). The Athenian makes no attempt to explain why some forms of hunting are lazy, while others are courageous, nor does he explain why a lazy form of hunting is bad and not simply an efficient use of one’s time.

The third interpretation lies between the first two and attempts to reconcile the rational and non-rational readings. Suppose that the preludes are described by the Athenian as appealing to reason and suppose that the actual preludes do not appeal to reason, but instead to emotion. What could explain this inconsistency? Two answers present themselves and represent the main readings that could be classified as being in the middle. The first is that the Athenian is using the description of the preludes to offer an ideal of law according to which the citizens freely and rationally obey the law. However, due to the psychological limitations of humans, the actual preludes will not live up to this ideal. The second answer is more pragmatic. The Athenian wants citizens to be motivated to obey the law. He recognizes that citizens will be diverse in both their interests and intellectual abilities. Because of this, the lawgiver will have to appeal to different types of things in order to motivate citizens, some rational and others non-rational.

8. Book 5

a. Ethics

Having explained the concept of a prelude, the Athenian proceeds to offer a prelude which will preface the entire legal code of Magnesia. This prelude provides the moral foundation for the city, explaining the general duties of the citizens. These duties fall under three main headings: to the soul, to the body, and to other citizens. The prelude ends with an attempt to show that the virtuous life leads to the maximum amount of pleasure and the vicious life leads to the maximum amount of pain. Below is an outline of the main ideas expressed in this section of Book 5.

The Athenian explains that the soul is the master of the body and because of this it should be given priority over the body. Nevertheless, most humans fail to do this, and instead pursue beauty, wealth, and pleasure at the expense of virtue, and as a result, they prioritize the body over the soul (726a-728d). Although humans should prioritize the soul over the body, they are also obligated to take care of their bodies. However, people do not honor the body by being extremely beautiful, healthy, and strong. Rather, they honor the body by achieving a mean between the extremes of each of these states. The same principle applies to wealth. Too much wealth will lead to feuds and greed, while too little wealth will make one vulnerable to exploitation (728d-729a).

Readers might find the idea of honoring the soul and body not only mystical-sounding, but also wrong. After all, it might be good for me to be physically healthy, but it doesn’t seem like I’m violating a duty if I’m not. However, these oddities can be explained away if we consider three things. First, the Athenian’s division between honoring the soul and honoring the body maps onto the distinction he articulated in Book 1 between divine and human goods. Humans honor the soul by pursuing virtue. This is a divine exercise because the soul itself is divine (726a). Although the religious connection is important for Plato, this distinction is really between “internal” and “external” goods. Internal goods are the goods of the mind and character, while external goods are everything that is potentially good that lies outside the mind and character. For Plato, the value of external goods depends on the presence of internal goods, while the value of internal goods in no way depends on the presence of external goods. In other words, internal goods are good in every situation, while external goods are only good in some situations. Because of this, Plato finds it odd that humans devote so much time and energy to pursuing external goods and so little to achieving internal goods.

Second, Ancient Greek ethics is usually interpreted as egoistic in the sense that ethical inquiry centers on the question of what is the best life for an individual. In this framework, discussions about why one should become virtuous are put in terms of how virtue relates to well-being. In other words, the Ancient Greek ethicists argue that we have self-regarding reasons to become virtuous; namely, that virtue will help us live a successful and happy life. With this in mind, it makes sense that Plato would think that we are obligated to care for the soul and body, since the good life requires it.

Third, it is worth bearing in mind that the main ethical theories today have self-regarding features built into them and thus this idea is not entirely unique to Plato (and other Ancient Greek ethicists). The three main ethical theories today are virtue ethics (advocated by Plato), deontology, and consequentialism. Immanuel Kant, the inspiration for deontology, held that we have the duty of self-improvement, while consequentialism, in its most traditional form, holds that when determining how I ought to act, my own personal welfare is given consideration.

After expressing that citizens ought to care for others, the Athenian offers a fascinating argument in defense of the virtuous life. The crux of the argument is that vice leads to emotional extremes, while virtue leads to emotional stability. Because emotional extremes are painful, it follows that the virtuous life will be more pleasant (732e-734e).

The Athenian aims to show that the virtuous life will lead to more pleasure than pain. In doing this, he hopes to undermine the all-too-common thought that the life of vice, though morally bad, is still enjoyable.

b. Geography and Population

The remainder of Book 5 returns to discussing the structure of Magnesia. This discussion covers a wide array of topics, which include: the selection of citizens (735a-736e), the distribution of land (736c-737d and 740a), the population (737e-738b and 740b-744a), religion (738c-738e), the ideal state (739a-739e), the four property classes (744b-745b), administrative units of the state (745b-745e), the flexibility of the law in light of facts (745e-746d), the importance of mathematics (746d-747d), and the influence of the climate (747d-747e). The main philosophical ideas in this part of the book are covered in sections 3 and 4 above.

9. Book 6

a. Voting and Offices

With the geography and population of Magnesia established, the Athenian begins to describe the various offices in the city and the electoral process (751a-768e). The electoral process is quite complicated and difficult to understand, but typically has four stages: nomination, voting, casting lots, and scrutiny. All citizens who have served (or are serving) in the military will nominate candidates by writing their names on publicly displayed tablets. During this time, they are permitted to erase any names they find unsuitable. The names that appear most frequently will be assembled into a list from which citizens will cast their votes. This process will then repeat; the names of citizens who have the most votes will be assembled into another list. From this list, lots will be drawn to determine who gets the position. If the selected names pass scrutiny, they will be declared elected.

One might wonder what value casting lots adds to the electoral process, especially since the practice is no longer common. In Plato’s time, casting lots was seen as a democratic process, while voting was seen as more of an oligarchic process (Aristotle Politics 4.9.1294b8-13). The idea is that if all citizens are equal, then they all equally deserve to hold office; thus, the only fair procedure would be to have the office chosen randomly. To have citizens vote for a candidate is to admit that some citizens are more qualified than others. Hence, the inclusion of lot casting is a concession to the egalitarian sentiment found in democracies.

This is most clearly seen in the Athenian’s discussion of equality (756e-758). The Athenian distinguishes between two types of equality: arithmetic equality and geometric equality (these are Aristotle’s terms, see Politics 5.1.1301b29-1302a8, Nicomachean Ethics 5.3.1131a25-5.5.1133b28). Arithmetic equality treats everyone as equal and corresponds to the lot, while geometric equality treats everyone based on their nature and abilities and corresponds more closely to voting. The Athenian maintains that geometric equality is the true form of equality since humans have different natures and to treat them as equal is actually a form of inequality. However, most citizens will not see things this way and thus the inclusion of the lot is a way to avoid dissension.

There are various offices described in Book 6, but three are worthy of note: the assembly, the council, and the guardians of the law. The assembly is open to all citizens who are serving or have served in the military. Its main function is to elect members of the council and other officials, though there are other functions (753b, 764a, 767e-768a, 772c-d, 8.850b, 11.921e, 12.943c). The council comprises ninety members from each of the four property classes, totaling 360 members. Membership lasts one year and the main function is to conduct the day-to-day business of the state, such as supervising elections and organizing the assembly (756b-758d). The guardians of the law are thirty-seven citizens aged at least fifty. They will hold the position for at most twenty years and their primary function is to guard the law (752-755b). They guard the law by supervising both officials and ordinary citizens, by helping resolve difficult judicial cases, and by supplementing and revising the law. Within both the electoral process and the offices held, we see the Athenian’s attempt to develop a constitution that mixes various political elements.

b. Marriage

The conversation abruptly shifts to the topic of marriage and child-rearing, with an aside on slavery. In continuing with his emphasis on moderation and mixed constitutions, the Athenian encourages people to marry partners who have opposite characteristics. Although people are attracted to those who are like them, citizens will be encouraged to put the good of the state above their own preferences. However, because citizens will find such laws to be excessively restrictive, the Athenian only wants to encourage, but not require, citizens to marry people with opposite qualities (773c-774a). If male citizens do not marry by the age of thirty-five, they will be subject to fines and dishonors.

These laws might strike one as rather draconian; nonetheless, one should keep in mind three things. First, the marriage laws in Magnesia are inspired by actual practices in Crete and Sparta. Second, the laws are less severe than the ones expressed in the Republic, in which there is no private marriage for the guardian class (that is, soldiers and philosophers). In the Republic, the guardians will consider each (appropriately aged) person of the opposite sex to be their spouse. Mating will be arranged by using a lottery. However, the lottery is rigged such that a select few will actually be controlling the sexual relationships so as to avoid incest, control the population, and implement eugenics (Republic 5.459d-460c). Of course, Plato does not provide the details of the marriage laws surrounding the working class citizens and for all we know these might have been similar to the ones in Magnesia. Third, for his time, Plato is actually progressive in his views of women. In Book 6, the Athenian advocates for the inclusion of women in the practice of common meals, an inclusion that Aristotle lists as something peculiar to Plato (Politics 2.12.1274b10-11). The Athenian emphasizes that a city cannot flourish unless all citizens receive a proper education.

10. Books 7 and 8

Traditional Greek education involved both musical and gymnastic training. Musical education includes all of the subjects of the Muses, subjects such as music, poetry, and mathematics. Gymnastics is education related to physical activity. It includes things like military training and sports. Books 7 and 8 provide the details of Plato’s account of education, which extends to both males and females. Education, for Plato, mostly comes in the form of play and its importance cannot be overstated. The following passage captures this idea, as well as Plato’s conservatism:

If you control the way children play, and the same children always play the same games under the same rules and in the same conditions, and get pleasure from the same toys, you’ll find that the conventions of adult life too are left in peace without alteration… Change, we shall find, except in something evil, is extremely dangerous (Saunders trans., 797a-c).

Below is a sketch of the main educative laws and principles.

a. Musical Education

The poetry and theatre allowed in Magnesia will mostly present images and sounds that provide positive moral lessons (814e-816d, 817b-817d). The underlying idea behind these restrictions is that humans will develop characteristics of the people they observe in poetry and theatre. If they see bad people doing well or acting as cowards, they will be more inclined to become bad and cowardly. There is a notable exception, however, in that comedy will be allowed as long as it is performed by slaves or foreigners (816d-e).

The Athenian’s policy concerning musical education extends the views discussed in Books 1 and 2 in two ways. First, the policies reflect the view that the character we develop is largely shaped by what we find pleasurable and painful. The art and entertainment in the city should be such that we take pleasure in good and beautiful things and are pained by bad and ugly things. Second, the inclusion of comedy reflects the lessons of the discussion concerning drunkenness; we can only learn to resist shameful behavior if we have some exposure to it.

All Magnesians will learn basic mathematics, with some advancing to study astronomy. This is significant because in the Republic, Plato says that it is through mathematics that we come to learn about non-sensible properties, which are the subject of philosophical thought (7.522c-540b). In the Republic, this study is commonly thought to be reserved for the most elite and talented citizens, while in the Laws a portion of it is given to the entire citizen body. This suggests that, on some level, all Magnesians will have some awareness of philosophy.

b. Gymnastics

Physical education aims at achieving two things: (1) the development of good character traits and (2) military training. Because physical education is meant to provide military training, sports will be modified to emphasize this. For example, impractical and unrealistic techniques will be forbidden (796a, 813e, and 814d) and armed competitions will be emphasized (833e-834a).

It is clear enough how physical education could prepare one for the military, but how does it contribute to one’s character? There are two related ways in which physical movement affects one’s character. First, the Athenian argues that physical movement directly affects one’s emotions. For example, the Athenian insists that fetuses and infants must constantly be moved around so that their excessive fears and anxieties are purged (789b-791d). Another example of this kind of thinking is the Athenian’s claim that a moderate amount of physical hardship is required for children to develop virtue; too much luxury will make one spoiled and lack moderation, but too much hardship will make one misanthropic (791d-794a). Second, the Athenian maintains that humans take on the characteristics of the things that they imitate. Dancers will become graceful and courageous by imitating graceful and courageous movements, while they will become the opposite by imitating the opposite (814e-816e).

11. Book 9

a. Responsibility

In Plato’s so called “early dialogues,” Socrates defends the paradoxical claim that injustice is always involuntary because it is a result of ignorance. The evil doer actually desires what is good, so when they act wrongly, they are not doing what they actually want to do (Protagoras 352a-c; Gorgias 468b; Meno 77e-78b). We can break this paradoxical view into two claims:

Involuntary Thesis: No one is voluntarily unjust.

Ignorance Thesis: All wrongdoing is the result of ignorance.

In Book 9 of the Laws, Plato grapples with both claims. On the one hand, the Athenian is adamant that the involuntary thesis is true, but on the other hand, he acknowledges that all lawgivers seem to deny it. Lawgivers punish voluntary wrongdoing more severely than involuntary wrongdoing. Moreover, the concept of punishment seems to presuppose that criminals are responsible for their actions, and this seems to presuppose that they act voluntarily when they act unjustly. The Athenian, thus, faces a dilemma: he must either abandon the involuntary thesis or he must explain how the involuntary thesis is able to preserve the underlying thought in law that some crimes are accidental and others are not (860c-861d).

The Athenian refuses to abandon the involuntary thesis and attempts to resolve this difficulty by offering a distinction between injury and injustice. Injury explores what kind of harms were done to the victim and what the criminal owes to the victim, their family, or the state. Injustice explores the psychological conditions under which the crime was committed. He mentions three main conditions: anger (thumos), pleasure, and ignorance (862b-864c).

Although there is much scholarly debate surrounding this issue, the general idea appears to be that a criminal can harm someone voluntarily or involuntarily, but can never be unjust voluntarily. For example, I might intentionally bump my coffee cup so that it spills on your computer or I might accidentally do this. The former is a voluntary harm, while the latter is an involuntary harm. Accordingly, the former should be punished more severely than the latter. Nevertheless, even in the instance when I voluntarily damage your computer, I am not voluntarily unjust. This is because no one desires what is bad for them and injustice is bad for one, so no one desires injustice. If I truly knew what was good or was not overcome by pleasure or anger, I would not engage in vicious behavior because my soul would be just. Thus, Plato wants to preserve the involuntary thesis, while abandoning (or qualifying) the ignorance thesis by allowing for the possibility that anger and pleasure can move one to act unjustly.

Many scholars have pointed out that the Athenian appears to equivocate on the terms “voluntary” and “involuntary.” When discussing voluntary and involuntary harms the terms are used in the ordinary sense, reflecting what an agent actively or consciously desires and wishes. However, when discussing voluntary and involuntary injustice the terms are used in the Socratic sense, reflecting what an agent deeply desires and wishes. Hence, the ordinary sense only refers to conscious psychological states, while the Socratic sense can refer to unconscious states or what is entailed by desiring the good.

In any case, the Athenian’s overall point is clear. Punishment must not simply look to the harm that is caused, but must look to the psychological state under which injury resulted. This has the benefit of allowing for nuance when punishing agents since the degree of culpability can be found in the agent’s psychological state. An agent who deliberates and then kills someone should not be treated the same as someone who kills someone in anger or as the result of some unforeseen accident.

b. Punishment

The Athenian’s distinction between injury and injustice accords with his commitment to punishment as a means of recompense for the victim and as a cure for criminality. The purpose of the former is rather self-explanatory, but more needs to be said about the latter. As the Athenian explained in Book 1, the purpose of legal codes is to make citizens happy. Since happiness is linked to virtue, the law must try to make citizens virtuous. Seeing punishment as curative is really just an extension of this idea to the criminal. If justice is a healthy state of the soul, then injustice is a disease of the soul in need of curing via punishment. For passages that express this idea, see 5.728c, 5.735e, 8.843d, 9.854d-855b, 9.862d-863c, 11.933e-934c, 12.941d, and 12.957d. Unfortunately, the Athenian never explains how particular punishments will achieve this goal.

One might think that the Athenian’s curative view of punishment results in soft penalties, but this is far from true. Punishment will take six forms: death, corporal punishment, imprisonment, exile, monetary penalties, and dishonors. It is worth pointing out that the use of imprisonment as punishment in Greek society appears to be an innovation of Plato. One might wonder how capital punishment is compatible with a curative theory of punishment. The answer is that some people are beyond cure and death is best for them and the city (862d-863a). For Plato, psychological harmony, virtue, and well-being are all interconnected. Accordingly, the completely vicious who cannot be cured will always be in a state of psychological disharmony and will never flourish. Death is better than living in such a condition.

12. Book 10

Book 10 is probably the most studied and best known part of the Laws. The Book concerns the laws of impiety of which there are three varieties (885b):

Atheism: The belief that the gods do not exist.

Deism: The belief that the gods exist but are indifferent to human affairs.

Traditional Theism: The belief that the gods exist and can be bribed.

The Athenian believes that these impious beliefs threaten to undermine the political and ethical foundation of the city. Because of this, the lawgiver must attempt to persuade the citizens to abandon these false beliefs. If citizens refuse, they must be punished.

a. Atheism

Clinias is surprised that atheists exist. This is because he thinks that it is well agreed by Greeks and non-Greeks that certain visible celestial bodies are gods (885e). The Athenian takes Clinias to be too dismissive of atheists, since Clinias attributes their belief to a lack of self-control and desire for pleasure (886a-b). The Athenian explains that the cause of atheism is not a lack of self-control, but, rather, a materialistic cosmology (888e-890a). Atheists believe that the origins of the cosmos are basic elemental bodies randomly interacting with each other via an unintelligent process. Craft, which is an intelligent process, only comes into effect later once humans are created. There are two types of craft. First, there are those that cooperate with natural processes and are useful, such as farming. Second, there are those that do not cooperate with natural processes and are useless, such as law and religion. Hence, atheists hold that the cosmos is directed via blind random chance and things like religion and law are products of useless crafts.

The Athenian responds by defending an alternative cosmology, which reverses the priority of soul and matter. Readers should be warned that the argument is obscure, difficult, and probably invalid; let this merely serve as a sketch of the main moves in it. The Athenian begins by explaining that there are two types of motions. On the one hand, there is “transmitted motion,” which moves other things, but cannot move unless another motion moves it. On the other hand, there is “self-motion,” which moves itself as well as other things (894b-c). The first motion cannot be a transmitted motion or else there would have to be an infinite series of transmitted motions (894e). Additionally, if everything were to come to a complete rest, the only thing that could initiate motion again would be self-motion (895a-b). Thus, the first motion must be self-motion (895c).

Having established that the first cause is self-motion, the Athenian examines the nature of self-motion. He argues that a thing that moves itself must be said to be alive and whatever has a soul is alive (895c). In fact, the definition of soul is motion capable of moving itself (895e-896a). From this he concludes that soul is the first source of movement and change in everything and is prior to material things (896c-d). The Athenian asserts that if soul is prior to material bodies, then the attributes of soul (such as true belief and calculation) are also prior to material things (896d). Since soul is the cause of all things, it follows that it is the cause of both good and bad (896d). The Athenian concludes that since the soul dwells in and governs all moving things, it must govern the universe (896d-e).

The argument is not yet complete, however. At this point, even if the argument is sound, it does not establish that there are gods. At best, it only shows that there is at least one soul, or perhaps two, responsible for the motions in the world. The Athenian must show that the qualities that this self-moving soul possesses are divine and worthy of being called a god. This is what he does next by connecting the rationality of the soul with the divine and virtue (897b-899b).

The argument raises a number of interpretative and philosophical questions. One of the more tantalizing questions concerns Plato’s inclusion of a bad soul which is responsible for evil (896e). What is the nature of this bad soul and why does Plato include it? Most commentators have denied that the bad soul is anything like the devil; some hold it is cosmic evil in the universe generally, while others maintain it is located in humans. The inclusion of this issue is related to the problem of evil. The general worry is that if the world is governed by a rational, powerful, and good god (or gods), what explains the inclusion of evil in the world? Why would a rational, powerful, and good god allow for evil? Plato offers various answers. For example, in the Timaeus (42e-44d), evil is said to come from disorderly movements associated with necessity, in the Theaetetus (176a-b), evil is said to come from mortals, and in the Statesman (269c-270a), evil is said to come from god releasing control. Accordingly, the Laws is unique in that evil is explicitly tied to the soul. How we understand the nature of this evil soul will explain whether the view articulated in the Laws is compatible or incompatible with these other texts.

b. Deism and Traditional Theism

Having taken himself to have refuted atheism, the Athenian takes on deism and traditional theism. He notes that some youths have come to believe that the gods do not care about human affairs because they have witnessed bad people living good lives (899d-900b). The Athenian responds to this charge by arguing that the gods know everything, are all-powerful, and are supremely good (901d-e). Now if the gods were to neglect humans, it would be through ignorance, lack of power, or vice. However, because the gods clearly are not like this, the gods must care about the affairs of humans (901e-903a).

However, the Athenian recognizes that not everyone will be moved by this argument and offers a myth that he hopes will persuade doubters (903b-905d). The myth declares that each part of the cosmos was put together with a mind towards the well-being of the whole cosmos and not any single part. Humans go wrong in thinking that the cosmos is created for them; in reality, humans are created for the good of the cosmos. After this, the Athenian describes a process of reincarnation in which good souls are transferred to better bodies and bad souls to worse bodies. Thus, the unjust will wind up with bad lives and the just will wind up with good lives in the end.

The first part of this myth is important for what it teaches us about Plato’s ethical theory. Ancient ethical theories are often criticized as being too egoistic; that is, they overly focus on the happiness of the individual and not on the contribution to the happiness of others. However, this myth reveals that, at least for Plato in the Laws, this is inaccurate. The myth moves individuals away from their own selfish concerns to the good of everyone generally.

After this, the Athenian swiftly dismisses traditional theism. He maintains that the gods are rulers since they manage the heavens (905e). But what type of earthly rulers do the gods resemble? If traditional theism were true, the gods would resemble petty and greedy rulers (906a-e). But this is an absurd conception of the gods, who are the greatest of all things (907b). Hence, traditional theism must be wrong.

Setting aside issues of how to understand Plato’s theology in the Laws, there is the general question of why Plato thinks impiety will undermine the political system of Magnesia. It is easy enough to see why the deist and traditional theist pose a threat. If the gods are indifferent to human affairs or can be persuaded, then either the gods do not care about citizens disobeying the law or they can be bribed out of caring. It is less clear why the Athenian is concerned about atheists, however. Although he thinks that cultural relativism is a consequence of the atheist’s cosmological views, he admits that not all atheists are vicious and some are good (908b-c). Whatever the answer is, it is clear that Plato thinks that belief in god is in some way tied to thinking that morality is objective. This is a surprising stance in light of the claims put forth in the Euthyphro in which it is argued that ethical truths do not depend on the gods. These two texts are not necessarily inconsistent with each other; nonetheless, there is clearly a tension that requires explanation (see Divine Command Theory).

13. Book 11 and 12

a. Laws

Book 11 and the beginning of Book 12 discuss various laws, which have only a loose relation to each other. Most of this section is relatively self-explanatory and does not warrant additional comment. This section addresses: property law (913a-915c), commercial law (915d-922a), family law (922a-932d), and miscellaneous laws (932e-960c). Within the discussion of miscellaneous laws, the Athenian discusses an important office, “the scrutineers” (12.945b-948b). The function of scrutineers is to audit the officials of the city and to punish them when necessary. The scrutineers play an essential role in the system of checks and balances in Magnesia. But what ensures that the scrutineers themselves are not corrupt? They must be citizens with a proven reputation for good character who are capable of approaching matters impartially. Moreover, if an official feels they are being unfairly treated by a scrutineer, they can accuse the scrutineer and a trial will be held to determine the truth.

b. Nocturnal Council

The Laws ends with a discussion of the “nocturnal council,” so named because they meet daily from dawn until sunrise (951c-952d, 961a-968e). The nocturnal council is an elite group of elderly citizens, who have proven their worth by winning honors and have traveled abroad to learn from other states. The nocturnal council plays three roles in the city. First, they will be in charge of supplementing and revising the law in light of changing circumstances, while still keeping with the original spirit of the law. Second, the nocturnal council will study the ethical principles underlying the law. This involves studying the nature of virtue itself, discovering the ways in which the individual virtues of moderation, courage, wisdom and justice are really one Virtue. In addition, members of the nocturnal council will study cosmology and theology. Third, they will explore how these philosophical and theological ideas can be applied to the law. They are to ensure that, as far as possible, the law is in harmony with the philosophical principles they have learned.

The nocturnal council will bring to mind the Republic’s philosopher rulers in charge of the Callipolis. How similar they are depends on what kind of authority is granted to the nocturnal council. In the Callipolis, the philosopher rulers have absolute power, but it is far from clear whether this is the case for the nocturnal council. Indeed, it is a subject of much dispute. The difficulty stems from the fact that a few passages suggest that the nocturnal council will be entrusted with unrestricted power (7.818c, 12.968c, 12.969b). That being said, much of the Laws issues warnings about unrestricted power (see especially 3.691a-d, 4.713c, 9.875a-b); thus, it would be strange for the book to end with a renunciation of this thesis.

14. References and Further Reading

a. Standard Greek Texts

  • Burnet, J. (ed.), Platonis Opera. Vol. 5. (Oxford: Oxford Classical Texts, 1907).
  • Des Places, É. and Diès, A. (eds. and trans.), Platon: Oeuvres Complètes. Vols. 11-12. (Budé edn. Paris: Société d’Édition Les Belles Lettres, 1951-1956).

b. English Translations

  • Bury, R. G. Plato: Laws (Vol. 1 and 2). Loeb Classical Library, Plato Volume 10 and 11. (Cambridge, MA: Harvard University Press).
    • English translation side by side with the Greek text.
  • Pangle, T. The Laws of Plato, translated with Notes and Interpretative Essay. (Chicago: University of Chicago Press, 1980).
    • A more literal translation of the text, matching English words and Greek words with precision.
  • Griffith, T. Plato: The Laws. Cambridge Texts in the History of Political Thought, ed. M. Schofield. (Cambridge: Cambridge University Press, 2016).
  • Saunders, T. Plato: The Laws, translated with an Introduction. (London: Penguin Books, 1970).
    • A more stylized translation of the text that aims for readability. In addition, it breaks the text into smaller sections, offering a brief analysis of each.

c. General Discussions and Anthologies

  • Bobonich, C. (ed.), Plato’s ‘Laws’: A Critical Guide. (Cambridge: Cambridge University Press, 2010).
    • An anthology that surveys philosophical debates concerning the Laws. Chapter 1, authored by Malcolm Schofield, provides a helpful overview of the Laws.
  • Laks, A. “The Laws” in C. Rowe and M. Schofield, eds., The Cambridge History of Greek and Roman Political Thought. (Cambridge: Cambridge University Press, 1998).
    • A brief article that provides an overview of the Laws with a focus on political thought.
  • Sanday, E. (ed), Plato’s Laws: Force and Truth in Politics. Studies in Continental Thought. (Bloomington: Indiana University Press, 2012).
    • An anthology with chapters dedicated to each book of the Laws.
  • Stalley, R. F. An Introduction to Plato’s Laws. (Indianapolis: Hackett Publishing, 1983).

d. Culture, Laws, and Context

  • Cohen, D. “The Legal Status and Political Role of Women in Plato’s Laws.” Revue Internationale des Droits de l’Antiquité, 34 (1987): 27-40.
    • An optimistic assessment of the role of women in the Laws.
  • Morrow, G. Plato’s Cretan City: An Historical Interpretation of the Laws. (Princeton: Princeton University Press, 1960)
    • Details the various religious and political policies in the Laws, as well as placing them in a historical and cultural context.
  • Nightingale, A. W. “Plato’s Lawcode in Context: Rule by Written Law in Athens and Magnesia.” Classical Quarterly 49 (1999): 100-122.
    • Discusses the historical and cultural context underlying the laws of Magnesia.
  • Nightingale, A. W. “Writing/Reading a Sacred Text: A Literary Interpretation of Plato’s Laws.” Classical Philology 88 (1993): 279-300.
    • Offers a literary interpretation of the Laws.
  • Okin, Susan M. “Philosopher Queens and Private Lives: Plato on Women and the Family.” Philosophy & Public Affairs 6 (1977): 345-369.
    • Discusses how private property affects gender politics in Plato’s philosophy. Okin argues that Plato’s reintroduction of private property in the Laws results in more traditional roles for women than in the Republic.
  • Peponi, A-E (ed.). Performance and Culture in Plato’s Laws. (New York: Cambridge University Press, 2013).
    • Anthology that focuses on culture and music in Plato’s Laws.
  • Reid, J. “The Offices of Magnesia.” Polis 37 (2020): 567-589.
  • Saunders, T. J. Plato’s Penal Code. (Oxford: Oxford University Press, 1991).

e. The Preludes

  • Annas, J. Virtue and Law in Plato and Beyond. (New York: Oxford University Press, 2017).
  • Baima, N. R. and T. Paytas. Plato’s Pragmatism: Rethinking the Relationship between Ethics and Epistemology. (New York: Routledge, 2021).
    • Chapter 2 argues that the persuasion in the Laws is sometimes rational and truthful, and other times non-rational and deceptive.
  • Buccioni, E. “Revisiting the Controversial Nature of Persuasion in Plato’s Laws.” Polis 24 (2007): 262-283.
    • Defends a middle reading of the preludes, which compares the use of rhetoric in the Laws to that of the Phaedrus.
  • Bobonich, C. “Persuasion, Compulsion and Freedom in Plato’s Laws.” Classical Quarterly 41 (1991): 365-387.
    • Defends the rational interpretation of the preludes.
  • Laks, A. “Legislation and Demiurgy: On the Relationship between Plato’s Republic and Laws.” Classical Antiquity 9 (1990): 209-229.
    • Defends a middle reading of the preludes, according to which the preludes offer an ideal of law, but because of the psychological limitations of the citizens, the actual preludes are non-rational.
  • Morrow, G. “Plato’s Conception of Persuasion.” Philosophical Review 62 (1953): 234-250.
    • Defends a non-rational interpretation of persuasion.
  • Stalley, R. “Persuasion in Plato’s Laws.” History of Political Thought 15 (1994): 157-177.
    • Defends a non-rational interpretation of persuasion.
  • Williams, D. L. “Plato’s Noble Lie: From Kallipolis to Magnesia.” History of Political Thought 34 (2013): 363-392.
    • Argues that there is less political deception in Magnesia than in the Callipolis.

f. Ethics, Moral Psychology, and Political Thought

  • Barker, E. Greek Political Theory: Plato and his Predecessors. (London: Methuen, 1960).
    • A classic study of Plato’s political thought.
  • Belfiore, E. “Wine and Catharsis of the Emotions in Plato’s Laws.” Classical Quarterly 35 (1992): 349-361.
    • Compares the moral psychology advanced in the Republic to that of the Laws. Argues that the moral psychology in the Laws shares commonalities with Aristotle’s view of the effects of poetry.
  • Bobonich, C. Plato’s Utopia Recast: His Later Ethics and Politics. (Oxford: Oxford University Press, 2002).
    • Examines Plato’s moral psychology from the Phaedo to the Laws and concludes that Magnesia is Plato’s new utopia.
  • Bobonich, C. “Akrasia and Agency in Plato’s Laws and Republic.” Archiv für Geschichte der Philosophie 76 (1994): 3-36.
    • Argues that Plato does allow for weakness of will in the Laws.
  • Klosko, G. “The Nocturnal Council in Plato’s Laws.” Political Studies 36 (1988): 74-88.
  • Klosko, G. The Development of Plato’s Political Theory. (London: Methuen, 1986).
  • Meyer, S. S. Plato: The Laws 1 & 2. Translated with an Introduction and Commentary. (Oxford: Oxford University Press, 2015).
  • Samaras, T. Plato on Democracy. (New York: Peter Lang Publishing, 2002).
    • Part three discusses Plato’s political thought in the Laws.
  • Sassi, M. “The Self, the Soul, and the Individual in the City of the Laws.” Oxford Studies in Ancient Philosophy 35 (2008): 125-148.
    • Discusses the moral psychology in the Laws.
  • Saunders, T. J. “The Socratic Paradoxes in Plato’s Laws.” Hermes 96 (1968): 421-434.
    • An influential article on voluntary wrongdoing in the Laws.
  • Weiss, R. The Socratic Paradox and its Enemies. (Chicago: University of Chicago, 2006).
    • Chapter 9 discusses Plato’s distinction between injury and injustice and relates it to the idea that justice is beautiful and injustice is shameful.
  • Wilburn, J. “Tripartition and the Causes of Criminal Behavior in Laws 9.” Ancient Philosophy 33 (2013): 111-134.
    • Discusses Plato’s account of moral psychology and its relation to Book 9.
  • Wilburn, J. “Akrasia and Self-Rule in Plato’s Laws.” Oxford Studies in Ancient Philosophy 43 (2012): 25-33.
    • Presents an alternative reading of the puppet metaphor according to which it does not support weakness of will.

g. Theology

  • Carone, G. R. Plato’s Cosmology and its Ethical Dimensions. (Cambridge: Cambridge University Press, 2005).
    • Chapter 8 discusses Plato’s account of cosmic evil in Laws 10.
  • Mayhew, R. Plato: Laws 10. (Oxford: Oxford University Press, 2008).
    • Offers a line by line commentary and discussion of Book 10.
  • Mohr, R. God and Forms in Plato. (Las Vegas: Parmenides, 2006).
    • Chapters 8 and 11 focus on theology in the Laws.
  • Powers, N. “Plato’s Cure for Impiety in Laws 10.” Ancient Philosophy 34 (2014): 47-63.
    • Discusses how the context in which the Athenian presents his theology constrains the account given.
  • Solmsen, F. Plato’s Theology. (Ithaca: Cornell University Press, 1942).
  • Trelawny-Cassity, L. “On the Foundation of Theology in Plato’s Laws,” Epoché: A Journal for the History of Philosophy 18 (2014): 325-49.
    • Discusses Plato’s cosmology and theology in the Laws by connecting it to Plato’s methodology and ideas explored in the Phaedo, Statesman, Philebus, and Timaeus.

Author Information

Nicholas R. Baima
Email: nichbaima@gmail.com
Florida Atlantic University
U. S. A.

Arthur Prior: Logic

Arthur Norman Prior (1914-69) was a logician and philosopher from New Zealand who contributed crucially to the development of ‘non-standard’ logics, especially of the modal variety. His greatest achievement was the invention of modern temporal logic, worked out in close connection with modal logic. However, his work in logic had a much broader scope. He was also the founder of hybrid logic, and he made important contributions to deontic logic, modal logic, the theory of quantification, the nature of propositions and the history of logic. In addition, he discussed questions of ethics, free will, and general theology. Prior’s philosophical works comprise about 200 titles. His earliest articles center on philosophical theology and historical studies of Scottish Reformed Theology. This led on to the publication of his first influential work on ethics: Logic and The Basis of Ethics (1949). With the invention of tense-logic in the early 1950s, his focus shifted to investigations into the syntax of tempo-modal logic, leading to his seminal Time and Modality (1957), a volume derived from his John Locke Lectures in Oxford in 1956. Furthermore, Prior, together with the Irish mathematician and logician C.A. Meredith (1904-76), made important early contributions to the semantics of possible worlds. Prior’s tense-logic provided a strong conceptual framework for problems pertaining to the philosophy of time. In Time and Modality, Prior discussed the philosophical implications of Ruth Barcan’s famous formulae for tense-logic, and in the 1960s he worked on the notion of the present.

The most persistent problem running through Prior’s work is his study of the questions surrounding human freedom and divine foreknowledge, and more general philosophical problems emerging from this classical theological question. His thorough analysis of this problem, with the conceptual tools of tense-logic, received a crucial impetus from his correspondence with the young Saul Kripke, when the latter suggested the semantic tool of branching time to Prior. Prior’s two solutions to the problem of future contingency based on branching time, the Peircean and the Ockham solution, were most thoroughly developed in Past, Present and Future (1967), the most important work published by Prior. Characteristically for Prior’s methodological approach, the development of these two solutions was at the same time a development of two new systems of tense logic, and vice versa. One of Prior’s significant contributions to logic was his work on world propositions and instant propositions. In the course of developing these notions he also made one of the earliest formulations of hybrid logic. In Papers on Time and Tense (1968), he presented this idea in a more detailed manner in the context of his four grades of tense-logical involvement.

Table of Contents

  1. Life and Work
  2. Main Trends in Prior’s Philosophical Logic
    1. The Logic of Ethics
    2. The Syntax of Tempo-Modal Logic
    3. Human Freedom and Divine Foreknowledge
    4. Temporal Logic and Theories of Truth
    5. The Logic of Existence
    6. Four Grades of Tense-Logical Involvement
  3. Conclusion
  4. References and Further Reading

1. Life and Work

Arthur Prior was born in Masterton, New Zealand on December 4th, 1914. He graduated in philosophy in 1937 and worked for a number of years at Canterbury University, Christchurch, from 1952 as a professor. In 1959 he was appointed professor of philosophy at Manchester University, and in 1965 he became a reader in Oxford and a fellow of Balliol College. Prior died on October 6th, 1969 in Trondheim, Norway, while on a lecture tour in Scandinavia.

Arthur Prior’s mother died a few weeks after his birth. His father was a doctor and a medical officer during the First World War, and Arthur was brought up by his aunts and grandparents. Both of his grandfathers were Methodist ministers. It is obvious that Prior’s upbringing in a Christian family formed an important background for his later works in philosophy and logic.

In 1932 Prior went to Otago University at Dunedin. He set out to study medicine, but after a short time he instead went into philosophy and psychology. In 1934 he attended Findlay’s courses on ethics and logic. Through Findlay, Prior became interested in the history of logic. His M.A. thesis was devoted to this subject.

During his first year as a philosophy student at Otago University, Prior joined the Presbyterian denomination. He attended courses at the Presbyterian Knox Hall with a view to entering the Presbyterian ministry. This intention was never realised, but he was for many years to come a practising member of the Presbyterian community. In particular, he became a very active member of the Student Christian Movement (SCM). Major theological influences on him were Karl Barth, Emil Brunner, and to some extent Søren Kierkegaard (1940). Prior was also a socialist, and his adherence to socialism is especially evident in his early articles for the SCM magazine, Open Windows.

His article, “Can Religion be Discussed?”, was published in 1942. At this time, Prior found himself in a crisis of belief. This is evident from a diary entry, dated March 25, 1942 (Prior 2018). What motivated Prior to continue his theological studies despite his crisis of faith was a conviction that we are now in a position to evaluate: that “useful knowledge would grow out of [Prior’s] collection of theological systems, in good time.” After his crisis of faith, it appears that Prior returned to a somewhat more classical Presbyterian position. As late as 1958 he published a paper that seems to endorse this faith, if only vaguely: “The good life and religious faith” (1958).

In 1943, Arthur Prior married Mary Wilkinson. From 1943 till the end of World War Two, he served in the Royal New Zealand Air Force. In view of Nazism and the World War, Prior had given up his earlier pacifist leanings.

Prior’s first employment at Canterbury University College was in 1946. A vacancy had been made when Karl Popper left. At this time, Prior was still strongly committed to theological studies, and he was working on a book on the history and thought of Scottish (Presbyterian) Theology. Unfortunately, the Priors’ house burned down in March 1949. After the fire, in which some of his drafts perished, he gave up the project on Scottish Theology. His main intellectual interest from then on veered toward philosophy, ethics, and logic.

Prior’s first book, Logic and the Basis of Ethics, was published in 1949 by Oxford University Press. During 1950 and 1951 Prior wrote a manuscript for a book with the working title The Craft of Logic. This book was, however, never published as a whole, but in 1976 P.T. Geach and A.J.P. Kenny edited parts of it, which were published as The Doctrine of Propositions and Terms. In the first chapter of the book, “Propositions and Sentences”, Prior argued that according to the ancient as well as the medieval view a proposition may be true at one time and false at another (Prior 1976a, p. 38).

In the beginning of 1953 Clarendon Press accepted The Craft of Logic for publication on the condition that Prior made a number of rather substantial changes. As a result, Prior wrote a completely different book, Formal Logic, which was published in 1955.

Benson Mates’ short article, ‘Diodorean Implication’ (1949) made Prior even more aware of the interesting relation between time and logic. Prior realised that it might be possible to relate Diodorus’ ideas to contemporary works on modality by developing a calculus which included temporal operators analogous to the operators of modal logic.

Around 1953, Prior began to work on the development of a formal calculus of tenses. Mary Prior has described the first occurrence of this idea: “I remember his waking me one night, coming and sitting on my bed, and reading a footnote from John Findlay’s article on Time, and saying he thought one could make a formalised tense logic” (Hasle, 2003). This must have been some time in 1953. The footnote in (Findlay 1941), which Prior studied that night, was the following:

And our conventions with regard to tenses are so well worked out that we have practically the materials in them for a formal calculus…. The calculus of tenses should have been included in the modern development of modal logics. It includes such obvious propositions as that

x present = (x present) present;
x future = (x future) present = (x present) future;
also such comparatively recondite propositions as that
(x).(x past) future; i.e. all events, past and future will be past.

Findlay’s considerations on the relation between time and logic in this footnote were not very elaborate, but they apparently gave the final impulse to Prior’s idea of developing a formal calculus which would capture this relation in detail. From 1953 until his death in 1969 the development of tense logic was his main project. With his many articles and books on questions in tense logic he presented a very extensive and thorough corpus, which still forms the basis of tense logic as a discipline.

Prior was invited to give the ‘John Locke Lectures’ in Oxford. In 1956 the Priors went to Oxford for this purpose. This gave him an excellent opportunity to present his new findings regarding time and modality. Among the participants were John Lemmon, Ivo Thomas, and Peter Geach (Kenny 1970 p. 337). The lectures were later published as the book Time and Modality (1957a). It was this work which made Prior internationally known. In Oxford, Prior also made some important and lasting friendships and professional associations, especially with John Lemmon, Ivo Thomas, P.T. Geach, Elizabeth Anscombe, Carew Meredith, David Meredith, and C. Lejewski.

Prior had a strong belief in the value of formal logic. On the other hand, he also emphasised that logic has to do with real life. He wanted a logic that would take full advantage of formal methods, but which would also be sensitive to the reality of human experience.

Prior adopted Łukasiewicz’s so-called Polish notation, in which the conjunction is represented as Kpq. He emphasised that this prefix notation “obviates the necessity of using brackets”, so that “no special rules about bracketing and rebracketing need to be included among the rules for proving one formula from another” (1955a, p. 6). Polish notation was rather common during Prior’s lifetime. Apart from its theoretical appeal it had the significant practical advantage that proofs, among other things, could be written directly on a typewriter. Nevertheless, there is no doubt that Prior also was quite convinced about the syntactical superiority of Polish notation, for which he campaigned throughout his career as a logician.
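The claim that prefix notation needs no brackets can be illustrated with a short sketch. It assumes the standard Łukasiewicz letters N (negation), K (conjunction), A (disjunction), C (implication) and E (equivalence), of which only K is mentioned above; the function name `polish_to_infix` is purely illustrative. Each operator letter fixes how many subformulas follow it, so the structure of the formula can be recovered without any bracketing rules.

```python
# Standard Łukasiewicz prefix operators (an assumption beyond the text,
# which only mentions K): N (not), K (and), A (or), C (if), E (iff).
BINARY = {"K": "∧", "A": "∨", "C": "⊃", "E": "≡"}


def polish_to_infix(formula: str) -> str:
    """Translate a formula in Polish notation into bracketed infix form."""

    def parse(pos: int) -> tuple[str, int]:
        # Return the infix text of the subformula starting at pos,
        # together with the position just after it.
        if pos >= len(formula):
            raise ValueError("formula ends too early")
        symbol = formula[pos]
        if symbol == "N":
            operand, nxt = parse(pos + 1)
            return "~" + operand, nxt
        if symbol in BINARY:
            left, nxt = parse(pos + 1)
            right, nxt = parse(nxt)
            return f"({left} {BINARY[symbol]} {right})", nxt
        if symbol.islower():
            return symbol, pos + 1
        raise ValueError(f"unexpected symbol {symbol!r}")

    text, end = parse(0)
    if end != len(formula):
        raise ValueError("trailing symbols after a complete formula")
    return text


print(polish_to_infix("Kpq"))    # (p ∧ q)
print(polish_to_infix("CKpqr"))  # ((p ∧ q) ⊃ r)
print(polish_to_infix("CpKqr"))  # (p ⊃ (q ∧ r))
```

The last two lines show that CKpqr and CpKqr differ only in the order of letters, whereas their infix counterparts need brackets to be told apart.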

Prior not only preferred to use Polish notation for his works within symbolic logic; he also highly valued various parts of Polish logic, and he corresponded with several Polish logicians. In 1961 he even went to Poland to lecture (see 1962) and to take part in the 1961 ‘International Colloquium on Methodology of Science’, Warsaw. In particular, Prior found Łukasiewicz’s three-valued logic very interesting (1920, 1930), and he carried out some careful studies of this logic (see Prior 1952).

Prior was very interested in the history of logic, not only as a subject in its own right but also because he saw the works of ancient and medieval logicians as a significant contribution to the contemporary development of logic. From 1952 to 1955 he had seven articles published on the history of logic. Four of these were concerned with Medieval logic and one with Diodorean logic. His interest in the history of logic is also evident in Formal Logic. Prior was particularly interested in Aristotle, Diodorus, and the Scholastics, but his interest also extended to more recent logicians such as Boole and Peirce, the latter of whom he called “the greatest of all symbolic logicians” (1957b).

After the publication of Time and Modality, Prior received a number of important and interesting letters from various logicians. One of the logicians who wrote was Saul Kripke (Ploug and Øhrstrøm 2012). In two letters to Prior in September and October 1958, Kripke put forth the idea of branching time. During the following years Prior further developed this idea.

In 1959 Arthur Prior took up a professorship at the University of Manchester. At that time, he left the Presbyterian Church without joining any other denomination. One of his main reasons (compare Hasle 2012) was most likely the tension Prior saw between the ideas of predestination and free will, although the recent discovery of Prior’s diary from his crisis of faith gives us other reasons in addition (available in Prior 2018). Although Prior, throughout his career, continued to treasure his theological library, and to study problems related to theology (see Kenny 1970 p. 326), we find in his diary an explicit vision for such a commitment, one that doesn’t presuppose a commitment to personal beliefs.

In the early winter of 1962, Prior was visiting professor at the University of Chicago. During this stay he made some thorough studies of parts of Charles Sanders Peirce’s logic.

From September 1965 to January 1966 Prior was a visiting Flint professor at the University of California. During his stay in California, Prior made some important professional associations, especially with Dana Scott, Donald Davidson, David Lewis, and Richard Montague. In this period Past, Present, and Future (1967), often regarded as Prior’s most important book, was drafted. Apparently, Prior’s California lectures contributed significantly to the flourishing of logic there at that time, and in particular they seem to have sparked a great interest in tense logic in the USA.

The Priors stayed in Manchester for seven years. In 1966 Anthony Kenny recommended Prior for a fellowship at Balliol College. Prior was offered this position. He accepted, and the family moved to Oxford, where Prior worked until his death in 1969.

During the 1960s, Prior made some very important contributions to the understanding of the concept of time. He demonstrated that temporal logic can in fact be a very powerful tool in philosophical analysis—also in relation to many of the questions to which his earlier studies in theology and ethics had given rise. He kept his interest in theology and ethics throughout his life, and through his studies in time he managed to reignite the discussions of the relationship between God’s foreknowledge and humanity’s freedom.

2. Main Trends in Prior’s Philosophical Logic

Prior’s work on philosophical logic includes an analytical and modern component as well as an historical component. Nevertheless, there is no sharp distinction between Prior’s analytical and historical concerns on one hand and his work as a formal logician on the other.

The following sections concentrate on these main trends in Prior’s philosophical logic: (a) The Logic of Ethics; (b) The Syntax of Tempo-Modal Logic; (c) Human Freedom and Divine Foreknowledge; (d) Temporal Logic and Theories of Truth; (e) The Logic of Existence; and (f) Four Grades of Tense-Logical Involvement.

a. The Logic of Ethics

Prior’s first major contribution to philosophy was his work on the logic of ethics. The work culminated with the publication of Logic and the Basis of Ethics (1949), which rather quickly gained him recognition as a philosopher. The book displays Prior’s in-depth knowledge of Scottish philosophy, which forms the background of the discussion on the autonomy of ethics. In 1942, Prior had discussed this topic in “Faith, Unbelief and Evil” (available in Prior 2018), where the context is the question of whether God is a necessary foundation for morality. In Logic and the Basis of Ethics, Prior takes up the same argument, and argues that one cannot logically maintain that any foundation is better than another with regard to ethics. G.E. Moore had criticised the deduction of ‘ought’ from ‘is’ (that is, the so-called naturalistic fallacy). Prior agreed with Moore, but maintained that it would be a larger error to deny the autonomy of ethics (1949, p. 107).

Being a logician, Prior wanted to demonstrate that logic can be used in the study of ethics as well as in the study of nature. Prior pointed out that the logic of ethics is not a special kind of logic, nor a special branch of logic, but an application of it (1949, p. ix). He maintained that categorical obligations must lie on particular persons at particular moments.

For Prior as for many others working with ethics, the notion of duty is rather basic. In the 1950s, G.E. Moore’s definition of ‘duty’ in Principia Ethica was very influential. In this work, Moore repeatedly affirms that our duty is that action which, of all the alternatives open to us, will have the best total consequences. In his paper, ‘The Consequences of Actions’ (2003, p. 65-72), which was originally presented at the Joint Session of the Mind Association and the Aristotelian Society at Aberystwyth in 1956, Prior argued that this definition is very problematic. In many cases it is not very clear what should be accepted as a consequence of a given act or behaviour. For this reason, it may turn out to be very difficult to find out with any certainty what our duty is, given Moore’s definition of ‘duty’.

However, Prior’s criticism of Moore’s definition goes much deeper than an analysis of the idea of a consequence. In fact, Prior argued that there is a logical impossibility in there being such a thing as a duty in Moore’s sense. Supposing that determinism is not true, Prior considered “a number of alternative actions which we could perform on a given occasion”, and he argued that none of these actions can be said to have any “total consequences”, since “the total future state of the world depends on how these others choose as well as on how the given person chooses….” (2003, p. 65). This simply means that there is no such totality. For this reason, Prior rejects Moore’s idea of duty as being incoherent.

According to Prior, the only way out of this problem that is open for the utilitarian, involves another definition of ‘duty’. Following this alternative definition, duty is to do what will probably have the best total consequences of all the actions open to us. This solution is however somewhat problematic, conceptually speaking.

Prior’s criticism of utilitarian theory should also be seen in the light of the fact that he wanted ethics to be treated theoretically in another way, that is, in terms of deontic logic involving operators corresponding to obligation and permissibility.

The term ‘deontic logic’ had been suggested by Henrik von Wright (1951). In his Formal Logic (1955a), Prior defended von Wright’s view that the logic of obligation can be handled very much like the logic of necessity. He was, however, aware of the fact that many philosophers would resist this, and strongly insist that moral philosophy has very little to do with logical deduction. In the paper, “The Logic of Obligation and The Obligation of the Logician” (available in Prior 2018), Prior argued that although ethics cannot be deduced or derived from logic, ethical argumentation has to live up to certain formal standards, which are worth studying for their own sake.

Corresponding to the usual Aristotelian logical square for syllogisms, Prior constructed the following diagram explaining the mutual relations between some basic notions in deontic reasoning:

In his deontic logic, Prior used the operator P for ‘it is permissible that (such-and-such an act be done)’. From this operator we may construct the operator

    \[O = {\sim}P{\sim}\]

corresponding to ‘it is obligatory that …’. A deontic logic can be constructed by adding the following two axioms to propositional logic:

\[\begin{array}{ll}
\text{AD1.} & Oa \supset Pa \\
\text{AD2.} & P(a \vee b) \equiv (Pa \vee Pb)
\end{array}\]

together with the rule

\[\text{RD1:} \quad \vdash \alpha \equiv \beta \;\rightarrow\; \vdash P\alpha \equiv P\beta\]

Prior demonstrated that in this axiomatic system it is possible to derive the following rule:

\[\text{RD2:} \quad \vdash \alpha \;\rightarrow\; \vdash P\alpha\]

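\[\begin{array}{lll}
(1) & \vdash \alpha & \text{[assumption]} \\
(2) & \vdash Oa \supset Pa & \text{[AD1]} \\
(3) & \vdash {\sim}P{\sim}a \supset Pa & \text{[from 2 and def.]} \\
(4) & \vdash P{\sim}a \vee Pa & \text{[from 3 and propositional logic]} \\
(5) & \vdash P({\sim}a \vee a) & \text{[from 4 and AD2]} \\
(6) & \vdash \alpha \equiv ({\sim}a \vee a) & \text{[from 1 and propositional logic]} \\
(7) & \vdash P\alpha \equiv P({\sim}a \vee a) & \text{[from 6 and RD1]} \\
(8) & \vdash P\alpha & \text{[from 5 and 7]}
\end{array}\]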

RD2 means that if \(\alpha\) expresses a logical law, then it is a law that \(\alpha\) is permissible. Prior renders this more freely as ‘what I cannot but do, I am permitted to do’. This also amounts to ‘what I cannot but omit, I am permitted to omit’ and consequently also to the Kantian principle ‘what I ought, I can’. A number of other interesting theorems can be proved in Prior’s system, for instance:

\[(Oa \wedge O(a \supset b)) \supset Ob\]
(If doing what we ought commits us to doing something else, then we ought to do this something else.)

\[{\sim}Pa \supset O(a \supset b)\]
(Doing what is not permitted commits us to doing anything whatever.)

The latter example corresponds to one of the paradoxes of strict implication.

In an appendix in his Time and Modality (1957a), Prior discussed a different approach to deontic logic based on an idea from Alan Ross Anderson. According to this idea, a deontic logic can be established from modal logic by the addition of a propositional constant \(\mathfrak{R}\) corresponding to the reading ‘the world will be worse off’, ‘punishment ought to follow’ or something of that sort. Given a modal propositional logic with a possibility operator, \(\Diamond\), and the propositional constant \(\mathfrak{R}\), we may define ‘permissible’ in the following way:

\[Pa = \Diamond(a \wedge {\sim}\mathfrak{R})\]

In accordance with this definition Oa should be seen as an abbreviation of

\[\Box({\sim}a \supset \mathfrak{R})\]

where \(\Box\) is a necessity operator defined as \({\sim}\Diamond{\sim}\). In short, this means that a is permissible if it is possible that a is the case without ‘the bad thing’ (\(\mathfrak{R}\)) being the case. Similarly, a is forbidden if \(\mathfrak{R}\) necessarily follows from it.
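As a rough semantic check of these definitions, the following sketch evaluates them on a small Kripke frame. The frame, the names `WORLDS`, `ACCESS`, `permitted`, `obligatory` and `check_reduction` are all hypothetical choices made for illustration and are not taken from Prior or Anderson. The check confirms, for every valuation on this one frame, that Oa so defined coincides with ~P~a and that the distribution axiom AD2 holds; it is a finite sanity check, not a proof for modal systems in general.

```python
from itertools import product

# A small Kripke frame chosen only for illustration (hypothetical).
WORLDS = [0, 1, 2]
ACCESS = {0: [1, 2], 1: [1], 2: [0, 2]}


def diamond(truth):
    """Possibility: true at w if the argument holds at some accessible world."""
    return {w: any(truth[v] for v in ACCESS[w]) for w in WORLDS}


def box(truth):
    """Necessity: true at w if the argument holds at every accessible world."""
    return {w: all(truth[v] for v in ACCESS[w]) for w in WORLDS}


def neg(truth):
    """Pointwise negation of a valuation."""
    return {w: not truth[w] for w in WORLDS}


def permitted(a, bad):
    """Pa = possibly (a and not R), with `bad` playing the role of R."""
    return diamond({w: a[w] and not bad[w] for w in WORLDS})


def obligatory(a, bad):
    """Oa = necessarily (not-a implies R), i.e. necessarily (a or R)."""
    return box({w: a[w] or bad[w] for w in WORLDS})


def valuations():
    """Every assignment of truth values to the worlds."""
    for bits in product([False, True], repeat=len(WORLDS)):
        yield dict(zip(WORLDS, bits))


def check_reduction():
    """Check O = ~P~ and AD2 for all valuations of a, b and R on the frame."""
    for a, b, bad in product(list(valuations()), repeat=3):
        if obligatory(a, bad) != neg(permitted(neg(a), bad)):
            return False
        a_or_b = {w: a[w] or b[w] for w in WORLDS}
        pa, pb = permitted(a, bad), permitted(b, bad)
        if permitted(a_or_b, bad) != {w: pa[w] or pb[w] for w in WORLDS}:
            return False
    return True


print(check_reduction())  # True
```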

Using these definitions, (AD2) can be immediately derived in most modal systems. Prior demonstrates that

\[\Diamond a \supset Pa\]

is equivalent to

\[{\sim}\Diamond\mathfrak{R}\]

Since it cannot be accepted that all possible acts are permissible, Alan Ross Anderson suggested the assumption of \(\Diamond\mathfrak{R}\). In fact, he proposed the axiom

\[\Diamond\mathfrak{R} \wedge \Diamond{\sim}\mathfrak{R}\]

which simply states that \(\mathfrak{R}\) is contingent. Prior showed that the second part of the axiom is deductively equivalent in most modal systems to AD1, that is, \(Oa \supset Pa\). He also demonstrated that in most modal systems it is possible to derive the Kantian principle

\[Oa \supset \Diamond a\]

as well as the principle

\[(Oa \wedge O(a \supset b)) \supset Ob\]

Furthermore, he discussed the question of validity in various systems of more complicated theorems such as

\[O(Oa \supset a)\]

as well as the paradoxical

\[{\sim}Pa \supset O(a \supset b)\]

Prior wanted to study the logical machinery involved in the theoretical derivation of obligation. He claimed that this study involves

(a) the description of the actual situation, and
(b) relevant general moral rules.

Prior stated his fundamental creed regarding deontic logic by claiming that “our true present obligation could be automatically inferred from (a) and (b) if complete knowledge of these were ever attainable” (1949, p. 42).

Prior wanted to present ethical argumentation as an axiomatic system. But in doing so he understood that something extra-logical has to be taken for granted. In his unpublished draft “Logical Criticisms of the Theory Identifying Duty with Self-interest” (available in Prior 2018), which he apparently wrote from a lecture on ethics in 1947, he quoted C.S. Lewis, in The Abolition of Man: “If nothing is self-evident, nothing can be proved. Similarly, if nothing is obligatory for its own sake, nothing is obligatory at all” (1943, p. 21; Prior’s emphasis). Similarly, Prior accepted the idea of an extra-logical and axiomatic foundation for ethics (deontic logic), and he rejected the idea of reducing ethics to something else.

It is evident that Prior’s long-term ambition was to incorporate the logic of ethics into a broader context of time and modality. Unfortunately, he was never able to pursue this goal in detail, but he certainly managed to establish the broader context of time and modality into which the logic of obligation has to fit.

b. The Syntax of Tempo-Modal Logic

Prior revived the medieval attempt at formulating a temporal logic for natural language. In a short but thought-provoking sketch of the history of logic with a special view to tense-logic, Prior argued that the central tenets of medieval logic with respect to time and tense can be summarised in the following way:

(i) tense distinctions are a proper subject of logical reflection,
(ii) what is true at one time is in many cases false at another time, and vice versa. (1957a, p. 104)

Prior observed that ancient and medieval logicians took these assumptions for granted, but that they were eventually denied (or simply ignored) after the Renaissance. In fact, the waning of tense logic began with a gradual loss of interest in temporal structures, that is, it was item (i) which was first abandoned by the different schools of logic, and (ii) came to be rejected only afterwards.

Prior can be said to have realised the possibility of (re)formulating a logic based on these old assumptions. Major sources for him were Łukasiewicz’ discussion of future contingents (1920), which was inspired by Aristotle’s De Interpretatione, and the Diodorean Master Argument, which he came to study via a paper by Benson Mates on Diodorean Implication (1949).

Prior believed that the problems of future contingents can be analysed and much better understood by the use of a temporal logic which includes the operator F(n), read as “in n time units it will be the case that …”. In his earliest attempt (1953) to deal with these problems, he used Łukasiewicz’s three-valued logic, in which the third value, ½, was supposed to represent ‘indeterminate’. He suggested that contingent statements of the form F(1)p, such as the Aristotelian ‘There is a sea-fight tomorrow’, are all indeterminate.

However, Prior realised that there is a serious problem with this approach. In fact, the usual truth-functional technique breaks down for these theories. For instance, if \(F(1)p\) and \({\sim}F(1)p\) are both ‘indeterminate’ (½), it is very hard to explain how statements like the conjunction \(F(1)p \wedge {\sim}F(1)p\) and the disjunction \(F(1)p \vee {\sim}F(1)p\) could come out as anything other than ‘indeterminate’, when treated according to Łukasiewicz’s three-valued logic (Prior 1967, p. 135). He therefore decided to stick to a bivalent tense logic.
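The difficulty can be made concrete with a minimal sketch of Łukasiewicz’s three-valued connectives, assuming the standard tables in which negation is 1 − x, conjunction is the minimum and disjunction is the maximum of the two values. The function names are illustrative only. With F(1)p assigned ½, both the contradiction and the instance of excluded middle come out as ½.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # the third value, read as 'indeterminate'


def neg(x):
    """Łukasiewicz negation: 1 - x."""
    return 1 - x


def conj(x, y):
    """Łukasiewicz conjunction: the smaller of the two values."""
    return min(x, y)


def disj(x, y):
    """Łukasiewicz disjunction: the larger of the two values."""
    return max(x, y)


future = HALF              # value assigned to F(1)p
not_future = neg(future)   # value of ~F(1)p

print(conj(future, not_future))  # 1/2
print(disj(future, not_future))  # 1/2
```

Intuitively one would want the conjunction to be false and the disjunction to be true, whatever happens tomorrow; the truth-functional tables cannot deliver this.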

Prior’s early work on the logic of time also led to the paper Diodoran Modalities (1955c) (later spelling: ‘Diodorean’). In fact, his very first proper study in tense logic was an analysis of an ancient argument in favour of determinism, the Master Argument of Diodorus (1955b). This argument was constructed by Diodorus Cronus (ca. 340-280 B.C.), who was a philosopher of the Megarian school, and who achieved wide fame as a logician and a formulator of philosophical paradoxes (Sedley 1977). Unfortunately, only the premises and the conclusion of the Master Argument are known. We know almost nothing about the way in which Diodorus used his premises in order to reach the conclusion. It is, however, known that the Master Argument was presented as a trilemma. According to Epictetus, Diodorus argued that the following three propositions cannot all be true (Mates 1961, p. 38):

(D1) Every proposition true about the past is necessary.
(D2) An impossible proposition cannot follow from (or after) a possible one.
(D3) There is a proposition which is possible, but which neither is true nor will be true.


Diodorus used this incompatibility combined with the plausibility of (D1) and (D2) to justify that (D3) is false. Assuming (D1) and (D2) he went on to define possibility and necessity as follows:

(D\(\Diamond\)) The possible is that which either is or will be true.
(D\(\Box\)) The necessary is that which, being true, will not be false.


The reconstruction of the Master Argument certainly constitutes a genuine problem within the history of logic. However, it should be noted that the argument has been studied for reasons other than historical. First of all, the Master Argument has been read as an argument for determinism. Second, the Master Argument can be regarded as an attempt to clarify the conceptual relations between time and modality.
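The Diodorean definitions of possibility and necessity given above can be sketched on a toy model. The sketch assumes a finite, discrete line of instants, with a proposition represented as a list of truth values indexed by instant; both the representation and the function names are assumptions made for illustration, not part of Diodorus’ or Prior’s presentation.

```python
def diodorean_possible(history, now):
    """The possible: p is true now or at some later instant."""
    return any(history[now:])


def diodorean_necessary(history, now):
    """The necessary: p is true now and is not false at any later instant."""
    return all(history[now:])


# Truth values of a proposition p at instants 0, 1, 2, 3 (illustrative data).
p = [False, True, False, False]
not_p = [not value for value in p]

print(diodorean_possible(p, 0))   # True
print(diodorean_possible(p, 2))   # False
print(diodorean_necessary(p, 1))  # False

# Duality: the necessary is that whose negation is not possible.
print(all(
    diodorean_necessary(p, t) == (not diodorean_possible(not_p, t))
    for t in range(len(p))
))  # True
```

On this reading a proposition that is false from instant 2 onwards is already impossible at instant 2, which is why the definitions have been read as deterministic.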

Prior’s reconstruction (1967, p. 32 ff.) of the Master Argument is based on the assumption that the statements in question are in fact propositional functions whose truth-values can vary from time to time. Prior uses his tense-operators in the reconstruction:

P: “it has been the case that …”
F: “it is going to be the case that …”
H (= ~P~): “it has always been the case that …”
G (= ~F~): “it will always be the case that …”


With these assumptions it is possible to restate the reconstruction problem. Using symbols, (D1-3) can be formulated in the following way:

\[\begin{array}{ll}
\text{(D1’)} & Pq \supset \Box Pq \\
\text{(D2’)} & ((p \rightarrow q) \wedge \Diamond p) \supset \Diamond q \\
\text{(D3’)} & (\exists r)(\Diamond r \wedge {\sim}r \wedge {\sim}Fr)
\end{array}\]

where \(\rightarrow\) is the strict implication defined as

\[p \rightarrow q \equiv \Box(p \supset q)\]

Prior is, however, not able to reconstruct the argument only using (D1), (D2) and (D3). In addition to these, he needs two extra premises. He must assume the theses

\[\begin{array}{ll}
\text{(D4)} & (p \wedge Gp) \supset PGp \\
\text{(D5)} & \Box(p \supset HFp)
\end{array}\]

Prior’s proof that the three Diodorean premises (D1’, D2’, D3’) are inconsistent given (D4) and (D5) can be summarised as a reductio ad absurdum proof in the following way:

\[\begin{array}{lll}
(1) & \Diamond r \wedge {\sim}r \wedge {\sim}Fr & \text{[from D3’]} \\
(2) & \Diamond r & \text{[from 1]} \\
(3) & \Box(r \supset HFr) & \text{[from D5]} \\
(4) & \Diamond HFr & \text{[from D2’, 2 \& 3]} \\
(5) & {\sim}r \wedge G{\sim}r & \text{[from 1]} \\
(6) & PG{\sim}r & \text{[from 5 \& D4]} \\
(7) & \Box PG{\sim}r & \text{[from 6 \& D1’]} \\
(8) & {\sim}\Diamond HFr & \text{[from 7; contradicts 4]}
\end{array}\]

O. Becker (1960) has shown that the extra premises (D4) and (D5) can be found in the writings of Aristotle, and he claims it seems reasonable to assume that the extra premises were generally accepted in antiquity.

During the 1950s and the 1960s, Prior developed his calculus of tenses into a rather sophisticated formalism. In 1958 he entered into correspondence with Charles Hamblin of The New South Wales University of Technology, Australia. Their correspondence led to important results, especially on implicative relations among tensed propositions. Prior and Hamblin discussed two central issues in tense logic: the number of non-equivalent tenses, and the implicative structure of the tense operators. In 1958, Hamblin suggested a set of axioms with P and F as monadic operators, corresponding to “a simple interpretation in terms of a two-way infinite continuous time-scale”. Hamblin’s axioms are:

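\[\begin{array}{ll}
\text{Ax1:} & F(p \vee q) \equiv (Fp \vee Fq) \\
\text{Ax2:} & {\sim}F{\sim}p \supset Fp \\
\text{Ax3:} & FFp \equiv Fp \\
\text{Ax4:} & FPp \equiv (p \vee Fp \vee Pp) \\
\text{Ax5:} & {\sim}F{\sim}Pq \equiv (q \vee Pq)
\end{array}\]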

Hamblin also assumed three rules of inference:

R1: If A is a thesis, then \({\sim}F{\sim}A\) is also a thesis.
R2: If \(A \equiv B\) is a thesis, then \(FA \equiv FB\) is also a thesis.
R3: If A is a thesis, and A’ is the result of simultaneously replacing each occurrence of F in A by P and each occurrence of P in A by F, then A’ is also a thesis. (A’ is the so-called mirror-image of A.)

 

When these axioms and rules are added to the usual propositional calculus, a number of interesting theorems can be proved. In fact, Hamblin could prove that “there are just 30 distinct tenses”, which can be formed using only P, F and negation.
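The mirror-image operation used in rule R3 is a purely syntactic, simultaneous swap of F and P. A minimal sketch follows, assuming formulas are written as plain strings in which capital F and P occur only as tense operators and propositional variables are lower-case; the name `mirror_image` is illustrative.

```python
# Swap every F with P and every P with F in one pass;
# lower-case propositional variables are left untouched.
SWAP = str.maketrans("FP", "PF")


def mirror_image(formula: str) -> str:
    """Return the mirror image of a tense-logical formula written as text."""
    return formula.translate(SWAP)


print(mirror_image("FPp ≡ (p ∨ Fp ∨ Pp)"))     # PFp ≡ (p ∨ Pp ∨ Fp)
print(mirror_image(mirror_image("FFp ≡ Fp")))  # FFp ≡ Fp
```

Applied to Ax4, the rule yields the corresponding thesis about the past of the future, and applying the operation twice returns the original formula.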

In 1965, Hamblin and Prior ended up with the following implicative structure for the tense-operators, which according to Hamblin is “a bit like a bird’s nest” (Øhrstrøm and Hasle, p. 178):

Prior published this result regarding so-called linear tense-logic in his major work, Past, Present and Future (1967), in which he also showed that several other interesting tense logical systems can be established.

c. Human Freedom and Divine Foreknowledge

Prior was highly interested in the logical relations between the two doctrines of human freedom and divine foreknowledge. His preoccupation with free will was motivated by his struggle with the theological determinism that had drawn him to the Presbyterian Church in the first place.

Prior changed his mind regarding determinism around 1950. At this time, he developed into an adherent of indeterminism, and indeed, of free will. In his own words: “… the future is to some extent, even though it is only to a very small extent, something we can make for ourselves.” (“Some Free Thinking about Time”, Prior 2018)

Prior’s logical studies increasingly led him away from what he regarded as indispensable parts of the Christian faith. His main publication on the logical problems related to the doctrines of human freedom and divine foreknowledge was the paper “The Formalities of Omniscience” (1962; reprinted in Prior 2003, p. 39-58). In this article he discussed theological determinism in terms of temporal logic, and, as pointed out by Hasker (1989), he launched the modern discussion on divine foreknowledge and human freedom. The paper examines the idea of omniscience, especially in the form of the statement “God is omniscient”, and some putative consequences of it, such as:

“It is, always has been, and always will be the case that for all p, if p then God knows that p”, and:
“For all p, if (it is the case that) p, God has always known that it would be the case that p”.

Prior discusses various interpretations of such statements, especially with reference to St. Thomas Aquinas. He argues against Thomas’ view that God’s knowledge is in some way beyond time, but otherwise he consents to most of what Thomas had said about tense-logical reasoning. According to Prior’s interpretation of Thomas’ philosophy, Thomas would even agree on the rejection of the “Diodorean” assumption, (D5).

On the basis of his studies of medieval logic, Prior developed an argument regarding the contingent future and divine foreknowledge. In this argument a new operator is needed:

D: “God knows that …”

In “The Formalities of Omniscience” (2003, p. 39-58) as well as other writings, Prior presented several versions of the argument. The most interesting version can be rephrased using the following 5 principles:

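\[\begin{array}{lll}
\text{(P1)} & F(y)A \supset P(x)DF(x)F(y)A & \text{(Divine foreknowledge)} \\
\text{(P2)} & \Box(P(x)DF(x)A \supset A) & \text{(Infallibility of God’s knowledge)} \\
\text{(P3)} & P(x)A \supset \Box P(x)A & \text{(The fixity of the past)} \\
\text{(P4)} & (\Box(A \supset B) \wedge \Box A) \supset \Box B & \text{(Basic assumption about modality)} \\
\text{(P5)} & F(x)A \vee F(x){\sim}A & \text{(Principle of the excluded middle)}
\end{array}\]
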
Here A and B represent arbitrary well-formed statements within the logic. If q stands for some atomic statement, then F(y)q is a statement about the contingent future.

Principle (P1) states that if something is going to happen, God has already known for some time that it is going to happen. According to (P2), if it was the case x time units ago that God knew that A would be the case x time units later, then it necessarily follows that A is the case now. The principle (P3) means that if A was the case x time units ago, then it is necessary that it was the case x time units ago. (P4) is a basic assumption in modal logic, and (P5), which is about the determinateness of the future, states that either A is going to be the case in x time units or ~A is going to be the case in x time units.

The argument proceeds in two phases: first from divine foreknowledge to necessity of the future, and from that argument to the conclusion that there can be no real human freedom of choice. Formally, the argument goes as follows:

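\[\begin{array}{lll}
(1) & F(y)q & \text{[assumption]} \\
(2) & P(x)DF(x)F(y)q & \text{[from 1 \& P1]} \\
(3) & \Box P(x)DF(x)F(y)q & \text{[from 2 \& P3]} \\
(4) & \Box(P(x)DF(x)F(y)q \supset F(y)q) & \text{[from P2]} \\
(5) & \Box F(y)q & \text{[from 3, 4, P4]}
\end{array}\]
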
In this way it is proved that

\[(6) \quad F(y)q \supset \Box F(y)q\]

and, similarly, it is possible to prove

\[(7) \quad F(y){\sim}q \supset \Box F(y){\sim}q\]

The second part of the main proof is carried out in the following way:

\[\begin{array}{lll}
(8) & F(y)q \vee F(y){\sim}q & \text{[from P5]} \\
(9) & \Box F(y)q \vee \Box F(y){\sim}q & \text{[from 6, 7, 8]}
\end{array}\]

Here (9) is conceived as a denial of the dogma of human freedom. Therefore, if one wants to save this dogma (and escape fatalism) at least one of the above principles (P1-5) has to be rejected. Prior realized that this can be achieved in several ways. He argued, however, that two of them are particularly important, that is, the denials of (P3) and (P5). The solution based on the denial of (P3) is called the Ockhamistic solution. According to this view not all propositions formulated in the past tense should be treated as statements properly about the past, and (P3) should only be accepted if P(x)A is a statement about the proper past. This would rule out the use of (P3) to deduce (3) from (2), since P(x)DF(x)F(y)q is not a statement about the proper past.

Prior’s own position was that (P3) should in fact be accepted and (P5) should be rejected. His view on future contingents was that their truth value cannot be known now, not even by God, that is, there are no true statements about future contingents. On this view, the statement ‘There will be a sea-battle tomorrow’ cannot be true today, and the same is the case for the statement ‘There will be no sea-battle tomorrow’. Prior would maintain that both of these statements are in fact false today, and suggested the following condition of truth with respect to future statements:

… nothing can be said to be truly ‘going-to-happen’ (futurum) until it is so ‘present in its causes’ as to be beyond stopping; until that happens neither ‘It will be the case that p’ nor ‘It will not be the case that p’ is strictly speaking true. (2003, p. 52)

Prior held that the proposition \(F(x)p\) can only be true if it is ‘present in its causes’. The same can be said about \(F(x){\sim}p\). According to his view, propositions about the contingent future are false or not well-formed. In consequence, the proposition \(F(x)p \vee F(x){\sim}p\) is false if \(F(x)p\) is a statement about the contingent future.
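One way to picture this view is with a small branching-time model in which F(x)p counts as true only if p holds x time units ahead on every possible branch. The sketch below is an illustration under that assumption; the model, the names `SUCCESSORS`, `SEA_BATTLE`, `moments_after` and `peircean_future`, and the choice to treat F as false when no later moment exists are all hypothetical and not drawn from Prior’s own formalism.

```python
# A tiny branching-time model (hypothetical): each moment lists its possible
# immediate successors, one time unit later.
SUCCESSORS = {
    "today": ["battle", "peace"],
    "battle": [],
    "peace": [],
}
SEA_BATTLE = {"today": False, "battle": True, "peace": False}


def moments_after(moment, steps):
    """All moments reachable from `moment` in exactly `steps` time units."""
    frontier = [moment]
    for _ in range(steps):
        frontier = [nxt for m in frontier for nxt in SUCCESSORS[m]]
    return frontier


def peircean_future(prop, moment, steps):
    """F(steps)prop: prop holds `steps` units ahead on every possible branch."""
    later = moments_after(moment, steps)
    # Assumption: with no later moments at all, the future statement is false.
    return bool(later) and all(prop[m] for m in later)


no_battle = {m: not v for m, v in SEA_BATTLE.items()}

will_be = peircean_future(SEA_BATTLE, "today", 1)
will_not_be = peircean_future(no_battle, "today", 1)
print(will_be, will_not_be, will_be or will_not_be)  # False False False
```

Because the two branches disagree about the sea-battle, neither future statement is settled today, and so their disjunction is false as well, which is exactly the rejection of (P5).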

Prior believed that St. Thomas Aquinas also held these ideas. Prior pointed out that this position regarding the contingent future is also quite essential in Peirce’s philosophy. In fact, Prior called this way of answering the problems raised by arguments like the one presented above the Peircean solution. This view means that he had to reject \(q \supset P(x)F(x)q\) as a thesis. If q is true now, but not something which had to be true (by necessity), then the Peircean solution implies that F(x)q was false x time units ago, for some x.

d. Temporal Logic and Theories of Truth

According to Peter Geach, Prior regarded his own research into the logic of ordinary language constructions as a continuation of the medieval tradition (Geach 1970, p. 188). In doing so, Prior sought an account of truth for propositions in modal logic which was more in line with intensional logic. This led him to important contributions to the logic of modality. He gave the very first formulation of the answer which is now normally given, that is, the answer in terms of accessibility between possible worlds. In fact, already in 1951, he had suggested dealing with modal logic using ‘state-descriptions’ (see Copeland 1996, p. 11). A few years later he showed how tense logic can be studied using instants as state-descriptions, which are ordered by an earlier-later relation. Together with Carew Meredith, Prior later developed these ideas further, and this work led to the significant invention of possible world semantics (see Copeland 1996, p. 8 ff.). In 1956, Prior and Meredith wrote a brief joint paper entitled Interpretations of Different Modal Logics in the ‘Property Calculus’. This paper, which was circulated in mimeograph form, contained the essential elements of possible worlds semantics for propositional modal logic. It seems that Jack Copeland (2002) is right in holding that in this paper a binary relation appeared for the first time as an accessibility-like interpretation of the relation in an explicitly modal context. The authors do not suggest any philosophical explanation of the relation or of the related object. Nevertheless, there can be no doubt that they had a relation between possible worlds in mind. As Jack Copeland has pointed out, Meredith, in a letter to Prior dated 10 October 1956, in fact uses the term ‘possible world’, and Meredith and Prior used the same term in Computations and Speculations (Bodleian Library, box 8, p. 119). Later Prior wrote:

I remember ... C. A. Meredith remarking in 1956 that he thought the only genuine individuals were ‘worlds’, i.e. propositions expressing total world-states, as in the opening of Wittgenstein’s Tractatus (‘The world is everything that is the case’). [2003, p. 219]

Using the idea of branching time which had been suggested by Saul Kripke in 1958, Prior showed that important differences between some of the systems can be illustrated graphically (Ploug and Øhrstrøm 2012). In fact, Prior discussed three different models of branching time. The main difference between these models has to do with the status of the future. The models fall into a small number of groups, where the basic ideas can be shown in a very intuitive way: consider once again the old Aristotelian example about the possible sea-fight tomorrow. Let us consider three ways (a, b and c below) of defining truth for statements like F(1)p:

(a) The first answer is that the two possibilities, sea-fight and no sea-fight, are both part of the future, and that none of them has any superior status relative to the other. This answer can be represented graphically in the following way:

The arrows on the end of the two future branches indicate that the statements ‘There is going to be a sea-battle (tomorrow)’ and ‘There is not going to be a sea-battle (tomorrow)’ are both true in this picture of branching time. That is, if we let p stand for ‘There is a sea-battle going on’, and F(1)p stand for ‘There is going to be a sea-battle tomorrow’, then

F(1)p \wedge F(1)~p

 

is true. The corresponding tense-logical system is called Kb after Saul Kripke.

(b) Prior named the Ockham-model after William of Ockham (c. 1285-1349), who in his logic had insisted that God knows the truth-value of every future contingent statement. According to this model, only one possible future is the true one, although we as human beings do not know which of them it is. Let us assume that there is in fact going to be no sea-fight tomorrow. In this case the future should be represented graphically in the following way, where a line not ending in an arrow indicates that it will be false to assert that the corresponding state-of-affairs will be the case tomorrow:

So, ~F(1)p \wedge F(1)~p is the true description of this situation, even though we may be unable to know this at the present moment (p, and so forth, being defined as above).

(c) Prior named the Peirce-model after Charles Sanders Peirce (1839-1914). According to this model, which Prior himself adopted as covering his own view, it makes no sense to speak about the true future as one of the possible futures. There is no future yet, just a number of possibilities. Hence, the future, or perhaps rather, the ‘hypothetical future,’ should be represented graphically in this way:

Neither F(1)p nor F(1)~p is true on this picture. However, if some proposition q holds tomorrow in all possible futures (that is, if the truth of q tomorrow is regarded as necessary), then F(1)q is true.

In order to describe the semantics for these tempo-modal systems in a more precise manner, Prior (1967, p. 126 ff.) needs a notion of temporal ‘routes’ or ‘temporal branches,’ that is, maximally ordered (that is, linear) subsets in (TIME,<). The term ‘chronicle’ is used in this article. Call C the set of all such chronicles.

An Ockhamistic valuation operator, Ock, can be defined in the structure (TIME,C,<). Given a truth-value for any propositional constant at any moment in TIME, Ock(t,c,p) can be defined recursively for any moment in any chronicle, t \in c:

(a) Ock(t,c,p \wedge q) iff both Ock(t,c,p) and Ock(t,c,q)
(b) Ock(t,c,~p) iff not Ock(t,c,p)
(c) Ock(t,c,Fp) iff Ock(t',c,p) for some t' \in c with t < t'
(d) Ock(t,c,Pp) iff Ock(t',c,p) for some t' \in c with t' < t
(e) Ock(t,c,\Box p) iff Ock(t,c',p) for all c' with t \in c'.

 

Ock(t,c,p) can be read ‘p is true at t in the chronicle c’. A formula p is said to be Ockham-valid if and only if Ock(t,c,p) for any t in any c in a branching time structure, (TIME,C,<).
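As an illustration only, the recursive clauses for Ock can be sketched in Python on a toy branching-time structure with one present moment and two possible tomorrows. The moment names, the tuple encoding of formulas, and the function name `ock` are assumptions of this sketch and are not Prior's notation.

```python
# Toy branching-time structure (hypothetical): one present moment, two tomorrows.
BEFORE = {("now", "battle"), ("now", "peace")}      # strict earlier-later relation <
C_BATTLE = frozenset({"now", "battle"})             # chronicle with a sea-battle
C_PEACE = frozenset({"now", "peace"})               # chronicle without one
CHRONICLES = [C_BATTLE, C_PEACE]
VALUATION = {"p": {"battle"}}                       # p: "a sea-battle is going on"


def ock(t, c, phi):
    """Return True iff formula phi is true at moment t in chronicle c."""
    if t not in c:
        raise ValueError("t must be a moment of the chronicle c")
    op = phi[0]
    if op == "atom":
        return t in VALUATION.get(phi[1], set())
    if op == "and":                                  # clause (a)
        return ock(t, c, phi[1]) and ock(t, c, phi[2])
    if op == "not":                                  # clause (b)
        return not ock(t, c, phi[1])
    if op == "F":                                    # clause (c): some later moment in c
        return any(ock(u, c, phi[1]) for u in c if (t, u) in BEFORE)
    if op == "P":                                    # clause (d): some earlier moment in c
        return any(ock(u, c, phi[1]) for u in c if (u, t) in BEFORE)
    if op == "box":                                  # clause (e): every chronicle through t
        return all(ock(t, c2, phi[1]) for c2 in CHRONICLES if t in c2)
    raise ValueError(f"unknown operator: {op!r}")


P = ("atom", "p")
FP = ("F", P)                                        # "there will be a sea-battle"
print(ock("now", C_BATTLE, FP))                      # evaluated relative to C_BATTLE
print(ock("now", C_PEACE, FP))                       # evaluated relative to C_PEACE
print(ock("now", C_BATTLE, ("box", FP)))             # necessity quantifies over chronicles
```

In this sketch the truth of Fp at the present moment depends on which chronicle is supplied, which is exactly the relativisation to a chronicle that the definition builds in.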

It may be doubted whether Prior’s Ockhamistic system is in fact an adequate representation of the tense logical ideas propagated by William of Ockham. According to Ockham, God knows the contingent future, so it seems that he would accept an idea of absolute truth, also when regarding a statement Fq about the contingent future—and not only what Prior has called “prima-facie assignments” (1967, p. 126) like Ock(t,c,Fq). That is, such a proposition can be made true ‘by fiat’ simply by constructing a concrete structure that satisfies it. But Ockham would accept that Fq could be true at t without being relativised to any chronicle.

Now, let us turn to the Peirce system. In this system the truth-operator differs from the Ockhamistic operator when it comes to the evaluation of propositions of the form Fp. In this case the Peircean truth-operator can be defined as follows:

Peirce(t,Fp) if and only if for all c' with t \in c': Ock(t,c',Fp)

 

Prior argued that this idea is included in Peirce’s philosophy. By analysing Peirce’s way of thinking and transferring this into the modern logic of time, Prior (1967, p. 132) found that in the Peircean system the following formula must hold for any proposition p:

~(F(x)p \wedge F(x)~p),

 

whereas its ‘excluded middle’ analogue

F(x)p \vee F(x)~p

 

does not hold in general. This is because both assertions, F(x)p and F(x)~p, can be false if they represent a pair of statements about the contingent future. It turns out that in the Peircean system F(x)p and \Box F(x)p are equivalent. It is also obvious that in this system q \supset HFq does not hold in general.
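The contrast between the two formulas can be checked on a small model. The following Python sketch assumes a toy structure with one present moment and two possible tomorrows, and it reads the metric operator F(x) in the Peircean way: F(x)p holds at t only if p holds x time units later on every chronicle through t. The names `peirce`, `moment_after`, and the tuple encoding of formulas are hypothetical choices made for this illustration.

```python
# Toy model (hypothetical): dates of moments and two chronicles through "now".
DATE = {"now": 0, "battle": 1, "peace": 1}
CHRONICLES = [("now", "battle"), ("now", "peace")]
TRUE_AT = {"p": {"battle"}, "q": {"battle", "peace"}}   # q holds in every tomorrow


def moment_after(t, c, x):
    """Return the moment lying x time units after t on chronicle c, or None."""
    for u in c:
        if DATE[u] == DATE[t] + x:
            return u
    return None


def peirce(t, phi):
    """Return True iff phi is true at moment t under the Peircean reading of F(x)."""
    op = phi[0]
    if op == "atom":
        return t in TRUE_AT.get(phi[1], set())
    if op == "not":
        return not peirce(t, phi[1])
    if op == "and":
        return peirce(t, phi[1]) and peirce(t, phi[2])
    if op == "or":
        return peirce(t, phi[1]) or peirce(t, phi[2])
    if op == "F":                                       # ("F", x, body)
        x, body = phi[1], phi[2]
        later = [moment_after(t, c, x) for c in CHRONICLES if t in c]
        return all(u is not None and peirce(u, body) for u in later)
    raise ValueError(f"unknown operator: {op!r}")


P = ("atom", "p")
FP = ("F", 1, P)                   # F(1)p
FNP = ("F", 1, ("not", P))         # F(1)~p
print(peirce("now", ("not", ("and", FP, FNP))))   # ~(F(1)p and F(1)~p)
print(peirce("now", ("or", FP, FNP)))             # F(1)p or F(1)~p
print(peirce("now", ("F", 1, ("atom", "q"))))     # F(1)q for a necessary q
```

Because the sea-battle occurs on only one of the two chronicles, both F(1)p and F(1)~p come out false at the present moment, so their disjunction fails while the negated conjunction holds.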

The discussion regarding the Ockhamistic versus the Peircean system was crucial for Prior in his attempts to deal with philosophical arguments in favour of determinism. His careful analyses of these systems were, however, not his only contribution to the further development of tense-logic. In fact, he studied a number of tense-logical systems corresponding to various notions of time (for instance, dense time, circular time, discrete time). He dealt with many of his findings in the paper, Recent Advances in Tense Logic, which was published shortly after his death in 1969.

e. The Logic of Existence

Prior was very interested in questions about time and existence. In particular, he discussed ideas of reality and quantification in the light of his temporal logic. He considered questions concerning the relation between logic and existence “the untidiest and most obscure part of tense-logic,” (1967, p. 172) and it was important to him to find solutions firmly grounded in tense-logic.

Among other things, Prior is famous for having introduced and defended the idea of presentism, that is, the position that only the present is real. In the paper The Notion of the Present, which he read at the launching of The International Society for the Study of Time in 1969, Prior offered this definition of presentism:

They [the present and the real] are one and the same concept, and the present simply is the real considered in relation to two particular species of unreality, namely the past and the future. (1970, p. 245)

The paper was published after his death based on his notes, and it has since become a well-received article among some presentists. But, as pointed out by Oaklander, although philosophers such as William Lane Craig, John Bigelow and Peter Ludlow “acknowledge their debt to Prior, ... [they] for one reason or another find his particular explication of presentism wanting” (2002, pp. 76-77). Prior’s definition has thus been much criticised for its rather radical implications for time and existence. Quentin Smith, himself a presentist, deems it “logically self-contradictory”:

If the real stands in relation to two particular species of the unreal, the unreal is real, since only something real can stand in relation to something. Unreality can no more stand in relation than it can possess monadic properties. (2002, p. 123)

However, as demonstrated in (Jakobsen 2011) it is evident from Prior’s notes that he struggled with formulating his definition in a satisfactory manner. It is important to emphasize that Prior was not stating that the present somehow stands in a relation to the unreal future and unreal past. It is rather that we understand what the present is, as we contrast it, in our mind, as the real to some present ideas of what isn’t real, namely ideas of the past and the future.

Prior’s ideas of presentism give rise to important questions regarding time and quantification. A key problem seems to be this: How can we quantify over future and past objects, if only the present exists? Prior considered the following example (1957a, p. 26):

(a) It will be the case that someone is flying to the moon.
(b) There is someone who will fly to the moon.

 

Here Prior understands (b) as, “There is someone presently existing who is going to fly to the moon”. If F stands for the future operator, the structure of (a) is F(\exists x: p) (that is, a quantification “within a modality”), whereas the formal structure of (b) is \exists x: Fp. The relation between statements like (a) and (b) had been studied by Ruth Barcan Marcus since 1946 in an attempt to combine modal logic with quantification theory. In particular, Ruth Barcan Marcus (1946) had studied systems in which the following formula holds:

F(\exists x: p) \supset \exists x: Fp

 

This formula is now known as Barcan’s formula and it can be applied to all kinds of modal operators. Prior maintained that Barcan’s formula should not hold in general for the future operator. He wanted a clear logical distinction between quantification “within a modality” and quantification outside the scope of a modality.
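The distinction Prior wanted to preserve can be made concrete with a small countermodel. The Python sketch below assumes, purely for illustration, that quantifiers range over the individuals existing at the moment of evaluation; the individuals `alice` and `bob`, the two moments, and the function names are all hypothetical.

```python
# Hypothetical two-moment model: bob does not yet exist today.
EXISTS = {"today": {"alice"}, "tomorrow": {"alice", "bob"}}
FLIES = {"today": set(), "tomorrow": {"bob"}}       # who is flying to the moon, and when
LATER = {"today": ["tomorrow"], "tomorrow": []}     # moments later than a given moment


def will_be_someone_flying(t):
    """F(exists x: p): at some later moment, someone existing then is flying."""
    return any(any(x in FLIES[u] for x in EXISTS[u]) for u in LATER[t])


def someone_now_will_fly(t):
    """exists x: Fp: someone existing at t is flying at some later moment."""
    return any(any(x in FLIES[u] for u in LATER[t]) for x in EXISTS[t])


print(will_be_someone_flying("today"))   # statement (a)
print(someone_now_will_fly("today"))     # statement (b)
```

In this model (a) holds today while (b) does not, so the implication from F(\exists x: p) to \exists x: Fp fails. The sketch depends on the stated assumption about the range of the quantifier, which is the very point at issue in the discussion of Barcan's formula.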

However, Prior realised that for formal reasons it is rather difficult to keep the quantification within a modality. With just a few seemingly quite straightforward axioms of tense logic and Prior’s own general theory of quantification, Barcan’s formula for the future operator becomes provable (Jakobsen et al. 2011). As demonstrated by Philip Hugly and Charles Sayward (1996, p. 240), Prior argued that there are non-eliminable, non-substitutional, non-objectual, non-referential kinds of quantification. They have suggested that, following Prior’s ideas, quantification can be presented as “a method for constructing general sentences applicable to virtually any type or category of term” (1996, p. 265). Prior rejected the view suggested by Nino Cocchiarella, according to whom it is acceptable to quantify over individual name-variables even when these names are now empty. Prior rejected the view primarily for metaphysical reasons. It seemed to him that such a view would introduce a kind of waiting room from where future existents waited to be called to the scene (1957a, p. 158). According to Prior, quantification over possibilia or future existents cannot be done over individual name variables, since there aren’t any facts about them before they exist, and, if there are no facts about them, it means that they don’t exist. On the other hand, he found that it is fully acceptable to quantify over common nouns. In fact, as discussed in (Jakobsen et al. 2011), he developed a so-called ε-calculus in order to deal with the logic of past and future objects.

A major challenge in dealing with non-existing objects has to do with the problem of statability. The point is that since new things have been brought into existence today, there are some statements which can be stated today, but which could not be stated yesterday. This was probably Prior’s main motivation for his proposal in 1957 of the modal system Q wherein it is assumed that in certain possible worlds, some propositions simply cannot occur. An example could be propositions directly concerned with individuals, which are absent from those worlds, since, according to Prior, no facts can be stated about an individual x except when x exists.

In 1959, Prior described the basic idea of the system Q with a hint to Wittgenstein, in the following way:

Nothing can be surer than that whereof we cannot speak, thereof we must be silent, though it does not follow from this that whereof we could not speak yesterday, thereof we must be silent today.

When translated into tense logical terms, the system Q offers an interesting example of a logical system which is, among other things, designed to solve problems associated with non-permanent or contingent existents.

It is interesting to study the problem of statability and its implications for the philosophy of time. It turns out, though, to be a very difficult task to establish a tense logical formalism within which we can deal with the temporal aspects of statability in a satisfactory way (see for instance Wegener & Øhrstrøm 1997). Still, the basic idea is evident when we are dealing with identifiable individuals. The very fact that individuals come into being makes it impossible for us to formulate crucial statements about such individuals in a satisfactory way before they have actually been brought into being. As Prior has pointed out, the statement, ‘It is not the case that Julius Caesar existed in 200 B.C.E.’ makes sense, but here it is important that the main verb is in the past and not in the present tense (2003, p. 92). In 200 B.C.E., a statement like ‘Julius Caesar does not exist’ would not make any sense. It was simply not statable then.

It may be argued that many future tense statements are not about particulars, but rather are about types. However, this observation certainly does not solve the problem of statability. Prior’s claim regarding non-statability is not only about the non-existence of subjects of predication. It is also a question about other parts of the vocabulary. The point is that new concepts, that is, new predicates, may arise. This means that the language of specification may be growing in a very radical manner.

Reflecting on the temporal aspects of statability, Prior (2003, p. 91) maintained that the passage of time not only means that more and more possibilities are lost. It also gives rise to new possibilities for us as new individuals come into being.

Furthermore, it should be mentioned that Prior was interested in questions concerning the identity of things over time. How can one thing at one time be the same as another thing at another time? How can a thing keep its identity over time? How can we be sure that individual things never split up into two (or more) identical individual things? Prior discussed these problems in a rather entertaining way in his “The Fable of the Four Preachers” (available in Prior 2018). In addition, he analysed the problems formally, showing that we may easily run into serious trouble if we assume that one thing can become two things (2003, p. 96 ff.).

f. Four Grades of Tense-Logical Involvement

It was Peter Geach who sometime in the early 1960s made Prior aware of the importance and relevance of McTaggart’s distinction between the so-called A-series and B-series conceptions of time (1967, p. vi). Since then the notions and arguments in McTaggart’s paper, The Unreality of Time (1908), have become necessary ingredients of all major treatments of the philosophical problems related to temporal logic.

McTaggart’s A-series conception is based on the notions of past, present, and future, as opposed to a ‘tapestry’ view on time, as embodied by the B-series conception of time. Prior later formally elaborated McTaggart’s distinction, and showed that we can discuss time using either a tense logic, corresponding to the A-series conception, or using an earlier-later calculus, corresponding to the B-series conception. Prior’s interest in McTaggart’s observations was first aroused when he realised that McTaggart had offered an argument to the effect that the B-series presupposes the A-series rather than vice versa (1967, p. 2). Prior was particularly concerned with McTaggart’s argument against the reality of tenses. Prior’s studies brought renewed fame to this argument. In consequence, it has been very important in the philosophical debate about various kinds of temporal logic and their mutual relations.

Prior rejected McTaggart’s conclusion; and he held that the temporal world should in fact be described in terms of tenses (that is, McTaggart’s A-series). In his view, the alternative description of temporality in terms of earlier-later (that is, McTaggart’s B-series) was secondary. Prior clearly considered this tense-logical view to be the fundamental one when it comes to the study of time. On the other hand, he found that the relations between the A-series and the B-series are crucial when it comes to a deeper understanding of logic and time. In his studies of the relations between the A-series and the B-series, Prior introduced four grades of ‘tense logical involvement’. (See Prior 2003, p. 119 ff.)

The first grade defines tenses entirely in terms of objective instants and an earlier-later relation. For instance, a sentence such as Fp, ‘It will be the case that p’, is defined as a short-hand for ‘There exists some instant t which is later than now, and p is true at t’, and similarly for the past tense; these definitions are

(DF) T(t,Fp) \equiv_{def} \exists t_{1}: t<t_{1} \wedge T(t_{1},p)
(DP) T(t,Pp) \equiv_{def} \exists t_{1}: t_{1}<t \wedge T(t_{1},p)

 

Tenses, then, can be considered as mere meta-linguistic abbreviations, so this is the lowest grade of tense logical involvement. The tenses are simply seen as a handy way of summarizing the properties of the before-after relations, which constitute the B-theory. The tenses do not have any independent epistemological status. The basic idea is a definition of truth relative to temporal instants:

(T1) T(t,p \wedge q) \equiv (T(t,p) \wedge T(t,q))
(T2) T(t,~p) \equiv ~T(t,p)

 

In addition, there may be some specified properties of the before-after relation, like, for instance, transitivity:

(B1) (t_{1} < t_{2} \wedge t_{2} < t_{3}) \supset t_{1} < t_{3}

 

In this way, instants acquire an independent ontological status. As we have seen, Prior rejected the idea of temporal instants as something primitive and objective.
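The first-grade reduction of tenses to quantification over instants can be illustrated with a short Python sketch. It assumes a finite, linearly ordered set of integer instants and a made-up valuation in which p holds only at instant 3; the function name `T` follows the truth-operator in the text, while the tuple encoding of formulas is a hypothetical convenience.

```python
# Hypothetical B-series model: five instants ordered by the usual < on integers.
INSTANTS = range(5)
HOLDS = {"p": {3}}          # p is true only at instant 3 (an arbitrary choice)


def T(t, phi):
    """Return True iff phi is true at instant t, using only instants and <."""
    op = phi[0]
    if op == "atom":
        return t in HOLDS.get(phi[1], set())
    if op == "and":                                   # (T1)
        return T(t, phi[1]) and T(t, phi[2])
    if op == "not":                                   # (T2)
        return not T(t, phi[1])
    if op == "F":                                     # (DF): some t1 with t < t1
        return any(t < t1 and T(t1, phi[1]) for t1 in INSTANTS)
    if op == "P":                                     # (DP): some t1 with t1 < t
        return any(t1 < t and T(t1, phi[1]) for t1 in INSTANTS)
    raise ValueError(f"unknown operator: {op!r}")


P_ATOM = ("atom", "p")
print(T(0, ("F", P_ATOM)))   # is p true at some instant later than 0?
print(T(4, ("F", P_ATOM)))   # is p true at some instant later than 4?
print(T(4, ("P", P_ATOM)))   # is p true at some instant earlier than 4?
```

Here the tense operators carry no meaning of their own: each is unpacked into a search over instants related by earlier-later, which is what makes this the lowest grade of tense-logical involvement.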

In the second grade of tense logical involvement, tenses are not reduced to B-series notions. Rather, they are treated on a par with the earlier-later relation. Specifically, a bare proposition p is treated as a syntactically full-fledged proposition, on a par with propositions such as T(t,p) (‘it is true at time t that p’). The point of the second grade is that a bare proposition with no explicit temporal reference is not to be viewed as an incomplete proposition. One consequence of this is that an expression such as T(t,T(t,p)) is also well-formed, and of the same type as T(t,p) and p. Prior showed how such a system leads to a number of theses, which relate tense logic to the earlier-later calculus, and vice versa. The following crucial rule of inference makes this relation within the second grade clear:

(RT) If ⊢ p, then ⊢ T(t,p) for any t and any truth-operator T.

 
 
He also stated the following basic assumptions regarding the truth-operator:

(TX1) (\forall t: T(t,p)) \supset p
(TX2) (\forall t_{1}: T(t_{1},p)) \supset T(t_{2},\forall t_{3}: T(t_{3},p))
(TX3) T(t_{1},p) \supset T(t_{2},T(t_{1},p))

 

According to the second grade of tense logical involvement, A-concepts and B-concepts are regarded as being on the same conceptual level. Neither set of concepts is conditioned by the other.

It may be a bit puzzling that p and T(t,p) can be treated as being on the same logical level, if one expects the former to belong to the logical language (or object language) and the latter to the semantics (or metalanguage). In Prior’s opinion, this is not at all surprising. In a paper on some problems of self-reference, he stated:

In other words, a language can contain its own semantics, that is to say its own theory of meaning, provided that this semantics contains the law that for any sentence x, x means that x is true. (1976b, p. 141)

This becomes even clearer in the third grade, according to which instants are seen as a special type of proposition. These instant-propositions describe the world uniquely, and are for this reason also called world-state propositions. Like Prior, let a, b, c \ldots be instant-propositions instead of t_{1}, t_{2}, \ldots. In fact, Prior assumed that such propositions are what ought to be meant by ‘instants’:

A world-state proposition in the tense-logical sense is simply an index of an instant; indeed, I would like to say that it is an instant, in the only sense in which ‘instants’ are not highly fictitious entities. (1967, pp. 188-9)

The traditional distinction between the description of the content and the indication of time for an event is thereby dissolved. From the properties of the logical language which embodies the third grade of tense logical involvement, Prior also showed that T(a,p) can be defined in terms of a primitive necessity-operator. Then tense logic, and indeed, all of temporal logic can be developed from the purely ‘modal notions’ of past, present, future, and necessity.

This idea of treating instants as some kind of world propositions was one of Prior’s most interesting constructions. It has been taken up by Patrick Blackburn (2006), Torben Braüner (2011) and others. They have shown that Prior’s ideas can be further developed into very useful structures, which they have labelled hybrid logics.

The fourth grade consists in a tense logical definition of the necessity-operator such that the only primitive operators in the theory are the two tense logical ones: P and F. Prior himself favoured this fourth grade. It appears that his reasons for wanting to reduce modality to tenses were mainly metaphysical, since it has to do with his rejection of the concept of the (one) true (but still unknown) future. If one accepts the fourth grade of tense-logical involvement, it will turn out that something like the Peirce solution will be natural, and that we have to reject solutions like the Ockhamistic theory.

3. Conclusion

Prior dealt with many problems within philosophical logic, and it was very important for him to view logic as strongly related to reality. He held that logic “is not primarily about language, but about the real world” (Copeland 1996, p. 45). According to him, the tenses are essential for the understanding of reality. “I believe in the reality of the distinction between past, present, and future”, he claimed (Copeland 1996, p. 47). In fact, he held that tense logic is important not only in philosophy, but also in metaphysics and in physics. He argued that the physicist should understand that tense-logical questions ought to be taken into serious consideration in the development of relativistic physics and other parts of the natural sciences dealing with time. He claimed that in doing so the scientist and the logician may co-operate:

The logician must be rather like a lawyer—not in Toulmin’s sense, that of reasoning less rigorously than a mathematician—but in the sense that he is there to give the metaphysician, perhaps even the physicist, the tense logic that he wants, provided that it be consistent. He must tell his client what the consequences of a given choice will be ... and what alternatives are open to him; but I doubt whether he can, qua logician, do more (1967, p. 59).

During the last years of his life, Prior became very interested in the logical aspect of the notion of the self and in what he called ‘Egocentric Logic’. In fact, he was preparing the book Worlds, Times and Selves, which Kit Fine finished after Prior’s death and published in 1977. A significant formal part of this work consists in developing the egocentric counterpart to ordinary tense or modal logic, whose crucial feature is the operator Q “that picks out those propositions that correspond to instants, worlds or selves, as the case may be” (1977, p. 8).

Prior’s most important achievement was his establishment of temporal logic as a research field within philosophical logic. He initiated a number of interesting studies within this new field, and he clearly demonstrated that temporal logic can be understood as having fundamental relations to essential problems in philosophy, theology and science (see e.g. Hasle et al. 2017).

This article is an elaboration and an update of Øhrstrøm, P. & Hasle, P.:“A.N. Prior’s Logic”. In Gabbay, D.; Woods, J. (Editors): Logic and the Modalities in the Twentieth Century. The Handbook of the History of Logic, Elsevier, Vol. 6, Chapter 5, pp. 323-71.

4. References and Further Reading

  • Barcan, Ruth C. 1946. “A Functional Calculus of First Order based on Strict Implication”, Journal of Symbolic Logic, 11, p. 2.
  • Becker, O. 1960, “Zur Rekonstruktion des Kyrieuon Logos des Diodorus Kronos (mit besonderer Rücksicht auf die Arbeiten von A. N. Prior)”, in Derbolav, J.; Nicolin, F. (Eds.), Erkenntnis und Verantwortung: Festschrift für Theodor Litt, Düsseldorf.
  • Blackburn, P. 2006. “Arthur Prior and Hybrid Logic”, Synthese, 150, pp. 329-72.
  • Braüner, T. 2011. Hybrid Logic and its Proof-Theory. Springer.
  • Copeland, Jack (Ed.) 1996. Logic and Reality: Essays on the Legacy of Prior, Oxford University Press.
  • Copeland, Jack 2002. “The Genesis of Possible Worlds Semantics”, Journal of Philosophical Logic, 31, pp. 99–137.
  • Findlay, J.N. 1941. “Time: A Treatment of Some Puzzles”, Australasian Journal of Psychology and Philosophy, 19, pp. 216-35.
  • Geach, P.T. 1970, “Arthur Prior: A Personal Impression”, Theoria, 3, pp. 186-8.
  • Hasle, P. 2003. “Life and Work of Arthur N. Prior: An Interview with Mary Prior”, in Prior 2003, pp. 293-310.
  • Hasle, P. 2012. “The problem of predestination: as a prelude to A.N. Prior’s tense logic”, Synthese, 188, October 2012, pp. 331-47.
  • Hasle, P., Blackburn, P. & Øhrstrøm, P. (Eds.) 2017. Logic and Philosophy of Time: Themes from Prior, Aalborg University Press.
  • Hugly, Philip & Sayward, Charles. 1996. Intensionality and Truth. An Essay on the Philosophy of A.N. Prior, Kluwer Academic Publishers, 1996.
  • Jakobsen, D., Øhrstrøm, P. and Schärfe, H. 2011. “A.N. Prior’s Ideas on Tensed Ontology”, in S. Andrews et al. (Eds.): ICCS 2011, LNAI 6828, pp. 118–30. Springer-Verlag Berlin Heidelberg.
  • Jakobsen, D. 2011. “A.N. Prior’s Notion of the Present”. In Time and Time Perception, LNCS 6789, pp. 36-45. Springer-Verlag Berlin Heidelberg.
  • Kenny, Anthony 1970. “Arthur Norman Prior (1914 – 1969)”, Proceedings of the British Academy, Vol. 56, pp. 321-49.
  • Lewis, C.S. 1944. The Abolition of Man, Harper Collins Publ.
  • Łukasiewicz, Jan 1920. “On Three-Valued Logic”, reprinted in Borkowski, L. (Ed.) 1970. Jan Łukasiewicz: Selected Works, Amsterdam.
  • Łukasiewicz, Jan 1930. “Philosophical Remarks on Many-Valued Systems of Propositional Logic”, reprinted in Borkowski 1970 (above).
  • Mates, Benson 1949. “Diodorean Implication”, Philosophical Review, 58, 1949, pp. 234-44.
  • Mates, Benson 1961. Stoic Logic, University of California Press.
  • McTaggart, J.M.E. 1908. “The Unreality of Time”, Mind, 17, pp. 457-74.
  • Meredith, Carew and Prior, Arthur 1956. “Interpretations of Different Modal Logics in the ‘Property Calculus’”, University of Canterbury, reprinted in Copeland, J. (Ed.) 1996, pp. 133-4.
  • Ploug, T. & Øhrstrøm, P. 2012. “Branching time, indeterminism and tense logic”, Synthese, 188, pp. 367-79.
  • Prior, A.N. 1940. “Makers of Modern Thought (1): Kierkegaard”, The Student Movement, March 1940, pp. 131-32.
  • Prior, A.N. 1942. “Can Religion be Discussed?”, Australasian Journal of Psychology and Philosophy, 15 (1937), pp. 141-151, reprinted in Flew, A., MacIntyre, A. (Eds.) 1955, New Essays in Philosophical Theology, S. C. M. Press, London, pp. 1-11.
  • Prior, A.N. 1946. “The Reformers Reformed: Knox on Predestination”, The Presbyter, 4, pp. 19-23.
  • Prior, A.N. 1949. Logic and the Basis of Ethics. Oxford University Press.
  • Prior, A.N. 1951. “The Ethical Copula”, Australasian Journal of Philosophy, 29, pp. 137-54, reprinted in Prior 1976b, pp. 9-24.
  • Prior, A.N. 1952. “Łukasiewicz’s Symbolic Logic”. Australasian Journal of Philosophy, 30, pp. 121-30.
  • Prior, A.N. 1953. “Three-valued Logic and Future Contingents”, The Philosophical Quarterly, 3, pp. 317-26.
  • Prior, A.N. 1955a. Formal Logic, Clarendon Press, Oxford.
  • Prior, A. N. 1955b. “Is Necessary Existence Possible?”, Philosophy and Phenomenological Research, 15, pp. 545-47.
  • Prior, A.N. 1955c. “Diodoran Modalities”, The Philosophical Quarterly, 5, pp. 205-13.
  • Prior, A.N. 1957a. Time and Modality, Oxford.
  • Prior, A.N. 1957b. “Symbolism and Analogy”, The Listener, April 25.
  • Prior, A.N. 1958. “The good life and religious faith” (East-West meeting at Canberra Dec. 1957), Australasian Journal of Philosophy, 36, pp. 1-13.
  • Prior, A.N. 1959. “Thank Goodness That’s Over”, Philosophy, 34, pp. 11-17.
  • Prior, A.N. 1962. Logic in England Today. Bodleian Library Box 5, 16 p. English original of “Wspolczesca logica w Anglii”, Ruch filozoficzny, 21 (1962), pp. 251-56.
  • Prior, A.N. 1967. Past, Present and Future, Oxford University Press.
  • Prior, A.N. 1969. “Recent Advances in Tense Logic”, The Monist, 53, pp. 325-39.
  • Prior, A.N. 1970. “The Notion of the Present”. Studium Generale, 23, pp. 245-248.
  • Prior, A.N. 1976a. The Doctrine of Propositions and Terms. Edited by P.T. Geach and A.J.P. Kenny, London.
  • Prior, A.N. 1976b. Papers in Logic and Ethics. Edited by P.T. Geach and A.J.P. Kenny, University of Massachusetts Press, Amherst, 1976.
  • Prior, A.N. and Fine, Kit 1977. Worlds, Times and Selves. University of Massachusetts Press/Duckworth, London. Based on manuscripts by Prior with a preface and a postscript by Kit Fine.
  • Prior, A.N. 2003. Papers on Time and Tense, 2nd Edition. Edited by Per Hasle, Peter Øhrstrøm, Torben Braüner, and Jack Copeland. Oxford University Press. (Available as open access from http://vbn.aau.dk/files/266668121/Logic_and_Philosophy_of_Themes_from_Prior_ONLINE.pdf)
  • Prior, A.N. 2018. Nachlass, http://nachlass.prior.aau.dk/.
  • Sedley, David 1977. “Diodorus Cronus and Hellenistic philosophy”, Proceedings Cambridge Philol. Soc., 203, pp. 74-120.
  • Wegener, M. & Øhrstrøm, P. 1997. “A New Tempo-Modal Logic for Emerging Truth”, in Jan Faye et al. (Eds.), Perspectives on Time, Boston Studies in the Philosophy of Science, 189. Kluwer Academic Publishers 1997, pp. 417-41.
  • Wright, Henrik von 1951. An Essay in Modal Logic, North-Holland Publ., Amsterdam.
  • Øhrstrøm, P.; Hasle, P., 1995. Temporal Logic – from Ancient Ideas to Artificial Intelligence. Kluwer Academic Publishers, Dordrecht.

 

Author Information

Peter Øhrstrøm
Email: poe@hum.aau.dk
Department of Communication and Psychology Aalborg University
Denmark

and

Per Frederik Vilhelm Hasle
Department of Information Studies University of Copenhagen
Denmark

and

David Jakobsen
Department of Communication and Psychology Aalborg University
Denmark

Higher-Order Theories of Consciousness

The most fundamental and commonly used notion of the term ‘conscious’ in philosophical circles is captured by Thomas Nagel’s famous “what it is like” sense (Nagel 1974). When I am in a conscious mental state, there is “something it is like” for me to be in that state from the subjective or first-person point of view. When I smell a rose or have a conscious visual experience, there is something it “seems” or “feels like” from my perspective. This is primarily the sense of “conscious state” that will be used throughout this entry. There is also something it is like to be a conscious creature whereas there is nothing it is like to be a table or tree.

Representational theories of consciousness attempt to reduce consciousness to “mental representations” rather than directly to neural or other physical states. This approach has been fairly popular over the past few decades. Examples include first-order representationalism (FOR) which attempts to explain conscious experience primarily in terms of world-directed (or first-order) intentional states (Tye 2005) as well as several versions of higher-order representationalism (HOR) which holds that what makes a mental state M conscious is that it is the object of some kind of higher-order mental state directed at M (Rosenthal 2005, Gennaro 2012). The primary focus of this entry is on HOR and especially higher-order thought (HOT) theory. The key question that should be answered by any theory of consciousness is: What makes a mental state a conscious mental state?

Section 1 introduces the overall representationalist approach to consciousness and briefly discusses Tye’s FOR. Section 2 presents three major versions of HOR: higher-order thought theory, dispositional higher-order thought theory, and higher-order perception theory. In section 3, a number of common and important objections and replies are presented. Section 4 briefly outlines a close connection between HOT theory and conceptualism, that is, the claim that the representational content of a perceptual experience is entirely determined by the conceptual capacities the perceiver brings to bear in her experience. Section 5 examines several hybrid higher-order and “self-representational” theories of consciousness which all hold that conscious states are self-directed in some way. Section 6 addresses the potentially damaging claim that HOT theory requires neural activity in the prefrontal cortex (PFC) in order for one to have conscious states.

Table of Contents

  1. Representationalism
  2. Higher-Order Representationalism
    1. Higher-Order Thought (HOT) Theory
    2. Dispositional HOT Theory
    3. Higher-Order Perception (HOP) Theory
  3. Objections and Replies
  4. HOT Theory and Conceptualism
  5. Hybrid Higher-Order and Self-Representational Theories
  6. HOT Theory and the Prefrontal Cortex
  7. References and Further Reading

1. Representationalism

Some current theories attempt to reduce consciousness in mentalistic terms, such as ‘thoughts’ and ‘awareness,’ rather than directly in neurophysiological terms. One popular approach is to reduce consciousness to mental representations of some kind. The notion of a “representation” is of course very general and can even be applied to pictures and signs. Much of what goes on in the brain might also be understood in a representational way. For example, mental events represent outer objects partly because they are caused by such objects in, say, cases of veridical visual perception. Philosophers often call such mental states “intentional states” which have representational content, that is, mental states that are “about” or “directed at” something, as when one has a thought about a house or a perception of a tree. Although intentional states, such as beliefs and thoughts, are sometimes contrasted with “phenomenal states,” such as pains and color experiences, it is clear that many conscious states have both phenomenal and intentional properties, such as in visual perceptions.

The general view that we can explain conscious mental states in terms of representational or intentional states is called “representationalism.” Although not automatically reductionistic, most versions of it do attempt such a reduction. Most representationalists believe that there is room for a second-step reduction to be filled in later by neuroscience. A related motivation for representational theories of consciousness is the belief that an account of intentionality can more easily be given in naturalistic terms, such as a causal theory whereby mental states are understood as representing outer objects via some reliable causal connection. The idea, then, is that if consciousness can be explained in representational terms and representation can be understood in purely physical terms, then there is the promise of a naturalistic theory of consciousness. Most generally, however, representationalism can be defined as the view that the phenomenal properties of conscious experience (that is, the “qualia”) can be explained in terms of the experiences’ representational properties.

It is worth noting here that the relationship between intentionality and consciousness is itself a major ongoing area of research with some arguing that genuine intentionality actually presupposes consciousness in some way (Searle 1992, Horgan and Tienson 2002). If this is correct, then it would be impossible to reduce consciousness to intentionality, but representationalists argue that consciousness requires intentionality, not vice versa. Of course, few if any today hold the very strong Cartesian view that all intentional states are conscious. Descartes thought that mental states are essentially conscious and there are no unconscious mental states at all. For much more on the relationship between intentionality and consciousness, see Gennaro (2012, chapter two), Chudnoff (2015), and the essays in Bayne and Montague (2011) and Kriegel (2013).

A first-order representational (FOR) theory of consciousness is one that attempts to explain and reduce conscious experience primarily in terms of world-directed (or first-order) intentional states. The two most cited FOR theories are those of Fred Dretske (1995) and Michael Tye (1995, 2000), but the emphasis here will be on Tye’s more developed theory.

Of course not all mental representations are conscious, so the key question remains: What exactly distinguishes conscious from unconscious mental states (or representations)? What makes an unconscious mental state a conscious mental state? Tye defends what he calls “PANIC theory.” The acronym “PANIC” stands for poised, abstract, non-conceptual, intentional content. Tye holds that at least some of the representational content in question is non-conceptual (N), which is to say that the subject can lack the concept for the properties represented by the experience in question, such as an experience of a certain shade of red that one has never seen before. But conscious states clearly must also have “intentional content” (IC) for any representationalist. Tye also asserts that such content is “abstract” (A) and so not necessarily about particular concrete objects. This is needed to handle hallucination cases where there are no concrete objects at all or cases where different objects look phenomenally alike. Perhaps most important for mental states to be conscious, however, is that such content must be “poised” (P), which is an importantly functional notion about what conscious states do. The “key idea is that experiences and feelings…stand ready and available to make a direct impact on beliefs and/or desires. For example…feeling hungry… has an immediate cognitive effect, namely, the desire to eat….States with nonconceptual content that are not so poised lack phenomenal character [because]…they arise too early, as it were, in the information processing” (Tye 2000, 62).

One common objection to FOR is that it does not apply to all conscious states. Some conscious states do not seem to be “about” or “directed at” anything, such as pains or anxiety, and so they would be non-representational conscious states. If so, then conscious states cannot generally be explained in terms of representational properties (Block 1996). Tye responds that pains and itches do represent in the sense that they represent parts of the body. Even hallucinations either misrepresent (which is still a kind of representation) or the conscious subject still takes them to have representational properties from the first-person point of view. Tye (2000) goes to great lengths in response to a host of alleged counter-examples to FOR. For example, with regard to conscious emotions, he says that they “are frequently localized in particular parts of the body. . . . For example, if one feels sudden jealousy, one is likely to feel one’s stomach sink . . . [or] one’s blood pressure increase” (Tye 2000, 51). He believes that something similar is true for fear or anger. Moods, however, are quite different and do not seem so easily localizable in the same way. Perhaps the most serious objection to Tye’s theory, however, is that what seems to be doing most of the work on Tye’s account is the extremely functional-sounding “poised” notion, and so he is arguably not really explaining phenomenal consciousness in entirely representational terms (Kriegel 2002). For other versions of FOR, see Harman (1990), Byrne (2001), and Droege (2003). Chalmers (2004) does an excellent job of presenting and categorizing the plethora of representationalist positions.

2. Higher-Order Representationalism

a. Higher-Order Thought (HOT) Theory

Once again, the key question is: What makes a mental state a conscious mental state? There is also a long tradition that has attempted to understand consciousness in terms of some kind of higher-order awareness (Locke 1689/1975). This view has been revived by several contemporary philosophers (Armstrong 1981, Rosenthal 1986, 1997, 2005, Lycan 1996, 2001, Gennaro 1996, 2012). The basic idea is that what makes a mental state conscious is that it is the object of some kind of higher-order representation (HOR). A mental state M becomes conscious when there is a HOR of M. A HOR is a “meta-psychological” or “meta-cognitive” state, that is, a mental state directed at another mental state (“I am in mental state M”). So, for example, my desire to write a good entry becomes conscious when I am (non-inferentially) “aware” of the desire. Intuitively, conscious states, as opposed to unconscious ones, are mental states that I am, in some sense, “aware of” being in. Conscious mental states arise when two unconscious mental states are related in a certain way, namely, that one of them (the HOR) is directed at the other (M).

This overall idea is sometimes referred to as the Transitivity Principle (TP):

(TP) A conscious state is a state whose subject is, in some way, aware of being in it.

The corresponding idea that I could be having a conscious state while totally unaware of being in that state seems like a contradiction. A mental state of which the subject is completely unaware is clearly an unconscious state. For example, I would not be aware of having a subliminal perception and thus it is an unconscious perception. There are various kinds of HOR theory with the most common division between higher-order thought (HOT) theories and higher-order perception (HOP) theories. HOT theorists, such as David Rosenthal (2005), think it is better to understand the HOR (or higher-order “awareness”) as a thought containing concepts. HOTs are treated as cognitive states involving some kind of conceptual component. HOP theorists (Lycan 1996) urge that the HOR is a perceptual state of some kind which does not require the kind of conceptual content invoked by HOT theorists. Although HOT and HOP theorists agree on the need for a HOR theory of consciousness, they do sometimes argue for the superiority of their respective positions (Rosenthal 2004, Lycan 2004, Gennaro 2012, chapter three).

One can also find something like TP in premise 1 of Lycan’s (2001) more general argument for HOR. The entire argument runs as follows:

(1) A conscious state is a mental state whose subject is aware of being in it.

(2) The “of” in (1) is the “of” of intentionality; what one is aware of is an intentional object of the awareness.

(3) Intentionality is representational; a state has a thing as its intentional object only if it represents that thing.

Therefore,

(4) Awareness of a mental state is a representation of that state. (From 2, 3)

Therefore,

(5) A conscious state is a state that is itself represented by another of the subject’s mental states. (1, 4)

The intuitive appeal of premise 1 leads naturally to the final conclusion, (5), which is just another way of stating HOR.
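
The structure of this argument can also be laid out semi-formally. What follows is only an illustrative sketch under stated assumptions, not Lycan’s own notation: the predicate symbols C, A, I, and R are introduced here for convenience, with C(m) read as “m is a conscious state of the subject,” A(h, m) as “h is the subject’s awareness of being in m,” I(h, m) as “m is an intentional object of h,” and R(h, m) as “h represents m.”

```latex
% Illustrative formalization only; C, A, I, R are assumed predicate symbols.
\begin{align*}
(1)\quad & \forall m\,\bigl(C(m) \rightarrow \exists h\, A(h,m)\bigr) \\
(2)\quad & \forall h\,\forall m\,\bigl(A(h,m) \rightarrow I(h,m)\bigr) \\
(3)\quad & \forall h\,\forall m\,\bigl(I(h,m) \rightarrow R(h,m)\bigr) \\
(4)\quad & \forall h\,\forall m\,\bigl(A(h,m) \rightarrow R(h,m)\bigr) && \text{from (2), (3)} \\
(5)\quad & \forall m\,\bigl(C(m) \rightarrow \exists h\, R(h,m)\bigr) && \text{from (1), (4)}
\end{align*}
```

On this rendering, (4) follows from (2) and (3) by chaining the two conditionals, and (5) follows from (1) and (4). The sketch uses only the left-to-right reading of premise 1, and it does not by itself establish that h must be distinct from m, as the word “another” in (5) requires.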

A related rationale for HOR, and HOT theory in particular, can be put as follows (based on Rosenthal 2004): A non-HOT theorist might still agree with HOT theory as an account of introspection or reflection, namely, that it involves a conscious thought about a mental state. This seems to be a fairly common sense definition of introspection that includes the notion that introspection involves conceptual activity. It also seems reasonable for anyone to hold that when a mental state is unconscious, there is no HOT at all. But then it stands to reason that there should be something “in between” those two cases, that is, when one has a first-order conscious state. So what is in between no HOT at all and a conscious HOT? The answer is an unconscious HOT, which is precisely what HOT theory says, that is, a first-order conscious state is accompanied by an unconscious HOT. Moreover, this explains what happens when there is a transition from a first-order conscious state to an introspective state: an unconscious HOT becomes conscious.

Still, it might seem that HOT theory results in circularity by defining consciousness in terms of HOTs. It also might seem that an infinite regress results because a conscious mental state must be accompanied by a HOT, which, in turn, must be accompanied by another HOT ad infinitum. However, as we have just seen, the standard and widely accepted reply is that when a conscious mental state is a first-order world-directed state the higher-order thought (HOT) is not itself conscious. But when the HOT is itself conscious, there is a yet higher-order (or third-order) thought directed at the second-order state. In this case, we have introspection which involves a conscious HOT directed at an inner mental state. When one introspects, one’s attention is directed back into one’s mind. For example, what makes my desire to write a good entry a conscious first-order desire is that there is a (non-conscious) HOT directed at the desire. In this case, my conscious focus is directed outwardly at the paper or computer screen, so I am not consciously aware of having the HOT from the first-person point of view. When I introspect that desire, however, I then have a conscious HOT (accompanied by a yet higher, third-order, HOT) directed at the desire itself (Rosenthal 1986, 1997). Indeed, it is crucial to distinguish first-order conscious states (with unconscious HOTs) from introspective states (with conscious HOTs).

[Figure 1]

HOT theorists do insist that the HOT must become aware of the lower-order (LO) state noninferentially in order to make it conscious. The point of this condition is mainly to rule out alleged counterexamples to HO theory, such as cases where I become aware of my unconscious desire to kill my boss because I have consciously inferred it from a session with a psychiatrist, or where my anger becomes conscious after making inferences based on my own behavior. The characteristic feel of such a conscious desire or anger may be absent in these cases, but since awareness of them arose via conscious inference, the HO theorist accounts for them by adding this noninferential condition.

b. Dispositional HOT Theory

Peter Carruthers (2000, 2005) has proposed a different form of HOT theory such that the HOTs are dispositional states instead of actual HOTs, though he also understands his “dispositional HOT theory” to be a form of HOP theory (Carruthers 2004). The basic idea is that the conscious status of an experience is due to its availability to higher-order thought. So “conscious experience occurs when perceptual contents are fed into a special short-term buffer memory store, whose function is to make those contents available to cause HOTs about themselves” (Carruthers 2000, 228). Some first-order perceptual contents are available to a higher-order “theory of mind mechanism,” which transforms those representational contents into conscious contents. Thus, no actual HOT occurs. Instead, according to Carruthers, some perceptual states acquire a dual intentional content, for example, a conscious experience of red not only has a first-order content of “red,” but also has the higher-order content “seems red” or “experience of red.” Thus, he also calls his theory “dual-content theory.” Carruthers makes interesting use of so-called “consumer semantics” in order to fill out his theory of phenomenal consciousness. That is, the content of a mental state depends, in part, on the powers of the organisms which “consume” that state, for example, the kinds of inferences which the organism can make when it is in that state.

Dispositional HOT theory is often criticized by those who do not see how the mere disposition toward a mental state can render it conscious (Rosenthal 2004). Recall that a key motivation for HOT theory is the Transitivity Principle (TP) but the TP clearly lends itself to an actualist HOT theory interpretation, namely, that we are aware of our conscious states and not aware of our unconscious states. And, as Rosenthal puts it, “Being disposed to have a thought about something doesn’t make one conscious of that thing, but only potentially conscious of it” (2004, 28). Thus it is natural to wonder just how dual-content theory explains phenomenal consciousness. It is difficult to understand how a dispositional HOT can render, say, a perceptual state actually conscious.

Carruthers is well aware of this objection and attempts to address it (Carruthers 2005, 55-60). He again relies heavily on consumer semantics in an attempt to show that changes in consumer systems can transform perceptual contents. That is, what a state represents will depend, in part, on the kinds of inferences that the cognitive system is prepared to make in the presence of that state, or on the kinds of behavioral control that it can exert. In that case, the presence of first-order perceptual representations to a consumer-system that can deploy a “theory of mind” and concepts of experience may be sufficient to render those representations at the same time as higher-order ones. This would confer phenomenal consciousness to such states. But the central and most serious problem remains: that is, dual-content theory is vulnerable to the same objection raised against FOR. This point is made most forcefully by Jehle and Kriegel (2006). They point out that dual-content theory “falls prey to the same problem that bedevils FOR: It attempts to account for the difference between conscious and [un]conscious . . . mental states purely in terms of the functional roles of those states” (Jehle and Kriegel 2006, 468). Carruthers, however, is more concerned to avoid what he takes to be a problem for “actualist” HOT theory, namely, that an unbelievably large amount of cognitive (and neural) space would have to be taken up if every conscious experience is accompanied by an actual HOT.

c. Higher-Order Perception (HOP) Theory

David Armstrong (1981) and William Lycan (1996, 2004) have been the leading proponents of HOP theory in recent decades. Unlike HOTs, HOPs are not thoughts and do not have conceptual content. Rather, they are to be understood as analogous to outer perception. One major objection to HOP theory is that, unlike outer perception, there is no obvious distinct sense organ or scanning mechanism responsible for HOPs. Similarly, no distinctive sensory quality or phenomenology is involved in having HOPs whereas outer perception always involves some sensory quality. Lycan concedes the disanalogy but argues that it does not outweigh other considerations favoring HOP theory. His reply is understandable, but the objection remains a serious one and the disanalogy cannot be overstated.

Gennaro argues against Lycan’s claim that HOP theory is superior to HOT theory because, by analogy to outer perception, there is an importantly passive aspect to perception not found in thought (Gennaro 2012, chapter three). The perceptions in HOPs are too passive to account for the interrelation between HORs and first-order states. Thus, HOTs are preferable. Gennaro sometimes frames it in Kantian terms: we can distinguish between the faculties of sensibility and understanding, which must work together to make experience possible. What is most relevant here is that the passive nature of the “sensibility” (through which outer objects are given to us) is contrasted with the active and cognitive nature of the “understanding,” which thinks about and applies concepts to that which enters via the sensibility. HOTs fit this latter description better than HOPs. In any case, what ultimately justifies treating HORs as thoughts is the exercise and application of concepts to first-order states (Rosenthal 2005, Gennaro 2012, chapter four).

More recently, however, Lycan has changed his mind and no longer holds HOP theory mainly because he now thinks that attention to first-order states is sufficient for an account of conscious states and there is little reason to view the relevant attentional mechanism as intentional or as representing first-order states (Sauret and Lycan 2014). Armstrong and Lycan had indeed previously spoken of HOP “monitors” or “scanners” as a kind of attentional mechanism but now it seems that “…leading contemporary cognitive and neurological theories of attention are unanimous in suggesting that attention is not intentional” (Sauret and Lycan 2014, 365). They cite Prinz (2012), for example, who holds that attention is a psychological process that connects first-order states with working memory. Sauret and Lycan explain that “attention is the mechanism that enables subjects to become aware of their mental states” (2014, 367) and yet this “awareness of” is supposed to be a non-intentional selection of mental states. Thus, Sauret and Lycan (2014) find that Lycan’s (2001) earlier argument, discussed above, goes wrong at premise 2 and that the “of” in question need not be the “of” of intentionality. Instead, the ‘of’ is perhaps more of an “acquaintance relation” although Sauret and Lycan do not really present a theory of acquaintance, let alone one with the level of detail offered by HOT theory.

Gennaro (2015a) offers reasons to doubt that the acquaintance strategy is a better alternative. Such acquaintance relations would presumably be somehow “closer” than the representational relation. But this strategy is arguably at best trading one difficult problem for an even deeper puzzle, namely, just how to understand the allegedly intimate and nonrepresentational “awareness of” relation between HORs and first-order states. It is also more difficult to understand such “acquaintance relations” within the context of any HOR reductionist approach. Indeed, acquaintance is often taken to be unanalyzable and simple in which case it is difficult to see how it could usefully explain anything, let alone the nature of conscious states. Zahavi (2007), who is not a HOT or HOP theorist, also recognizes how unsatisfying invoking ‘acquaintance’ can be. It remains unclear as to what this acquaintance relation is supposed to be. For other variations on HOT theory, see Rolls (2004), Picciuto (2011), and Coleman (2015).

3. Objections and Replies

Several prominent objections to HOR (and counter-replies) can be found in the literature. Although some also apply to HOP theory, others are aimed more specifically at HOT theory.

First, some argue that various animals (and even infants) are not likely to have the conceptual sophistication required for HOTs, and so that would render animal (and infant) consciousness very unlikely (Dretske 1995, Seager 2004). Are cats and dogs capable of having complex higher-order thoughts such as “I am in mental state M”? Although most who bring forth this objection are not HO theorists, Carruthers (1989, 2000) is one HO theorist who actually embraces the conclusion that (most) animals do not have phenomenal consciousness.

However, perhaps HOTs need not be as sophisticated as it might initially appear, not to mention some comparative neurophysiological and experimental evidence supporting the conclusion that animals have conscious mental states (Gennaro 1993, 1996). Most HO theorists do not wish to accept the absence of animal or infant consciousness as a consequence of holding the theory. The debate has continued over the past two decades (see for example, Carruthers 2000, 2005, 2008, 2009, and Gennaro 2004b, 2009, 2012, chapter eight). To give an example which seems to favor animal HOTs, Clayton and Dickinson and their colleagues (in Clayton, Bussey, and Dickinson 2003) have reported convincing demonstrations of memory for time in scrub jays. Scrub jays are food-caching birds, and when they have food they cannot eat, they hide it and recover it later. Because some of the food is preferred but perishable (such as crickets), it must be eaten within a few days, while other food (such as nuts) is less preferred but does not perish as quickly. In cleverly designed experiments using these facts, scrub jays are shown, even days after caching, to know not only what kind of food was where but also when they had cached it (see also Clayton, Emery, and Dickinson 2006). Such experimental results seem to show that they have episodic memory which involves a sense of self over time. This strongly suggests that the birds have some degree of meta-cognition with a self-concept (or “I-concept”) which can figure into HOTs. Further, many crows and scrub jays return alone to caches they had hidden in the presence of others and recache them in new places (Emery and Clayton 2001). This suggests that they know that others know where the food is cached, and thus, to avoid having their food stolen, they recache the food. This strongly suggests that these birds can have some mental concepts, not only about their own minds but even of other minds, which is sometimes referred to as “mindreading” ability. Of course, there are many different experiments aimed at determining the conceptual and meta-cognitive abilities of various animals so it is difficult to generalize across species.

There does seem to be growing evidence that at least some animals can mind-read under familiar conditions. For example, Laurie Santos and colleagues show that rhesus monkeys attribute visual and auditory perceptions to others in more competitive paradigms (Flombaum and Santos 2005, Santos, Nissen, and Ferrugia 2006). Rhesus monkeys preferentially attempted to obtain food silently only in conditions in which silence was relevant to obtaining the food undetected. While a human competitor was looking away, monkeys would take grapes from a silent container, thus apparently understanding that hearing leads to knowing on the part of human competitors. Subjects reliably picked the container that did not alert the experimenter that a grape was being removed. This suggests that monkeys take into account how auditory information can change the knowledge state of the experimenter (see also for example the essays in Terrace and Metcalfe 2005). Some of these same issues arise with respect to infant concept possession and consciousness (see Gennaro 2012, chapter seven, Goldman 2006, Nichols and Stich 2003, but also Carruthers 2009).

A second objection has been referred to as the “problem of the rock” and is originally due to Alvin Goldman (Goldman 1993). When I have a thought about a rock, it is certainly not true that the rock becomes conscious. So why should I suppose that a mental state becomes conscious when I think about it? This is puzzling to many and the objection forces HOT theorists to explain just how adding the HOT state changes an unconscious state into a conscious one. There have been, however, a number of responses to this kind of objection (Rosenthal 1997, Van Gulick 2000, 2004, Gennaro 2005, 2012, chapter four). Perhaps the most common theme is that there is a principled difference in the objects of the thoughts in question. For one thing, rocks and similar objects are not mental states in the first place, and HOT theorists are first and foremost trying to explain how a mental state becomes conscious. The objects of the HOTs must be “in the head.”

Third, one might object to any reductionist theory of consciousness with something like Chalmers’ hard problem, that is, how or why brain activity produces conscious experience (Chalmers 1995). However, it is first important to keep in mind that HOT theory is unlike reductionist accounts in non-mentalistic terms and so is arguably immune to Chalmers’s criticism about the plausibility of theories which attempt a direct reduction to neurophysiology (Gennaro 2005). On HOT theory, there is no problem about how a specific brain activity “produces” conscious experience, nor is there an issue about any a priori or a posteriori relation between brains and consciousness. The issue instead is how HOT theory might be realized in our brains for which there seems to be some evidence thus far (Gennaro 2012, chapters four and nine).

Still, it might be asked just how exactly any HOR theory really explains the subjective or phenomenal aspect of conscious experience. How or why does a mental state come to have a first-person qualitative “what it is like” aspect by virtue of the presence of a HOR directed at it? HOR theorists have been slow to address this problem though a number of overlapping responses have emerged. Some argue that this objection misconstrues the main and more modest purpose of their HOT theories. The claim is that HOT theories are theories of consciousness only in the sense that they are attempting to explain what differentiates conscious from unconscious states, that is, in terms of a higher-order awareness of some kind. A full account of “qualitative properties” or “sensory qualities” (which can themselves be unconscious) can be found elsewhere in their work, but is independent of their theory of consciousness (Rosenthal 1991, 2005, Lycan 1996). Thus, a full explanation of phenomenal consciousness does require more than a HOR theory but that is no objection to HOR theories as such. There is also a concern that proponents of the hard problem unjustly raise the bar as to what would count as a viable reductionist explanation of consciousness so that any such reductionist attempt would inevitably fall short (Carruthers 2000). Part of the problem may even be a lack of clarity about what would count as an explanation of consciousness (Van Gulick 1995).

Gennaro responds that HOTs explain how conscious states occur because the concepts that figure into the HOTs are necessarily presupposed in conscious experience (Gennaro 2012, chapter four, 2005). The idea is that first we receive information via our senses (or the “faculty of sensibility”). Some of this information will then rise to the level of unconscious mental states but they do not become conscious until the more cognitive “faculty of understanding” operates on them via the application of concepts. We can arguably understand such concept application in terms of HOTs directed at first-order states. Thus, I consciously experience (and recognize) the blue house as a blue house partly because I apply the concepts “blue” and “house” (in my HOTs) to my basic perceptual states. Gennaro urges that if there is a real hard problem, it has more to do with explaining concept acquisition (Gennaro 2012, chapters six and seven).

A fourth, and very important, objection to higher-order approaches is the question of how such theories can explain cases where the HO state might misrepresent the lower-order (LO) mental state (Byrne 1997, Neander 1998, Levine 2001, Block 2011). After all, if we have a representational relation between two states, it seems possible for misrepresentation or malfunction to occur. If it does, then what explanation can be offered by the HO theorist? If my LO state registers a red percept and my HO state registers a thought about something green, then what happens? It seems that problems loom for any answer given by a HOT theorist and the cause of the problem has to do with the very nature of the HO theorist’s belief that there is a representational relation between the LO and HO states. For example, if a HOT theorist takes the option that the resulting conscious experience is reddish, then it seems that the HOT plays no role in determining the qualitative character of the experience. On the other hand, if the resulting experience is greenish, then the LO state seems irrelevant. Nonetheless, Rosenthal and Weisberg hold that the HOT determines the qualitative properties, even in so-called “targetless” or “empty” HOT cases where there is no LO state at all (Rosenthal 2005, 2011, Weisberg 2008, 2011).

Gennaro argues instead that no conscious color experience would result in such cases, that is, neither reddish nor greenish experience especially since, for example, it is difficult to see how a sole (unconscious) HOT can result in a conscious state at all (Gennaro 2012, chapter four, 2013). He argues that there must be a conceptual match, complete or partial, between the LO and HO state in order for the conscious experience to exist in the first place. Weisberg and Rosenthal argue that what really matters is how things seem to the subject and, if we can explain that, we have explained all that we need to. But the problem here is that somehow the HOT alone is what matters. Doesn’t this defeat the purpose of HOT theory which is supposed to explain state consciousness in terms of the relation between two states? Moreover, according to the theory, the lower-order state is supposed to be conscious when one has an unconscious HOT.

In the end, Gennaro argues for the more nuanced claim that:

Whenever a subject S has a HOT directed at experience e, the content c of S’s HOT determines the way that S experiences e (provided that there is a full or partial conceptual match with the lower-order state, or when the HO state contains more specific or fine-grained concepts than the LO state has, or when the LO state contains more specific or fine-grained concepts than the HO state has, or when the HO concepts can combine to match the LO concept) (Gennaro 2012, 180).

The reasons for the above qualifications are discussed in Gennaro (2012, chapter six) but they basically try to explain what happens in some abnormal cases (such as visual agnosia) and in some other atypical contexts (for example, perceiving ambiguous figures such as the vase-two faces) where mismatches might occur between the HOT and LO state. For example, visual agnosia, or more specifically associative agnosia, seems to be a case where a subject has a conscious experience of an object without any conceptualization of the incoming visual information (Farah 2004). There appears to be a first-order perception of an object without the accompanying concept of that object (either first- or second-order, for that matter). Thus its “meaning” is gone and the object is not recognized. It seems that there can be conscious perceptions of objects without the application of concepts, that is, without recognition or identification of those objects. But one might instead hold that associative agnosia is simply an unusual case where the typical HOT does not fully match up with the first-order visual input. That is, we might view associative agnosia as a case where the “normal,” or most general, object concept in the HOT does not accompany the input received through the visual modality. There is a partial match instead. A HOT might partially recognize the LO state. So associative agnosia would be a case where the LO state could still register a percept of an object O (because the subject still does have the concept), but the HO state is limited to some features of O. Bare visual perception remains intact in the LO state but is confused and ambiguous, and thus the agnosic’s conscious experience of O “loses meaning,” resulting in a different phenomenological experience. When, for example, the agnosic does not (visually) recognize a whistle as a whistle, perhaps only the concepts ‘silver,’ ‘roundish,’ and ‘object’ are applied. But as long as that is how the agnosic experiences the object, then HOT theory is left unthreatened.

In any case, on Gennaro’s view, misrepresentations cannot occur between M and HOT and still result in a conscious state, that is, in a conscious experience reflecting mismatched and incompatible concepts (Gennaro 2012, 2013).

A final kind of objection worth mentioning has to do with various pathologies of self-awareness, such as somatoparaphrenia, which is a pathology of self characterized by the sense of alienation from parts of one’s body. It is a bizarre type of body delusion where one denies ownership of a limb or an entire side of one’s body. It is sometimes called a “depersonalization disorder.” Relatedly, anosognosia is a condition in which a person who suffers from a disability seems unaware of the existence of the disability. A person whose limbs are paralyzed will insist that his limbs are moving and will become furious when family and caregivers say that they are not. Somatoparaphrenia is usually caused by extensive right-hemisphere lesions, most commonly in the temporoparietal junction (Vallar and Ronchi 2009). Patients with somatoparaphrenia say some very strange things, such as “parts of my body feel as if they didn’t belong to me” (Sierra and Berrios 2000, 160) and “when a part of my body hurts, I feel so detached from the pain that it feels as if it were somebody else’s pain” (Sierra and Berrios 2000, 163). It is difficult to grasp what having these conscious thoughts and experiences is like.

There is some question as to whether or not the higher-order thought (HOT) theory of consciousness can plausibly account for the depersonalization psychopathology of somatoparaphrenia (Liang and Lane 2009, Rosenthal 2010, Lane and Liang 2010). Liang and Lane (2009) argue that it cannot. HOT theory has been critically examined in light of some psychopathologies because, according to HOT theory, what makes a mental state conscious is a HOT of the form that “I am in mental state M.” The requirement of an I-reference leads some to think that HOT theory cannot explain such cases, since there would seem to be cases where I can have a conscious state and not attribute it to myself (and instead to someone else). Liang and Lane (2009) initially argued that somatoparaphrenia threatens HOT theory because it contradicts the notion that the accompanying HOT has the form “I am in mental state M.” The “I” is not only importantly self-referential but essential in tying the conscious state to oneself and, thus, to one’s ownership of M.

Rosenthal (2010) basically responds that one can be aware of bodily sensations in two ways that, normally at least, go together: (1) aware of a bodily sensation as one’s own, and (2) aware of a bodily sensation as having some bodily location, like a hand or foot. Patients with somatoparaphrenia still experience the sensation as their own but also as having a mistaken bodily location (perhaps somewhat analogous to phantom limb pain where patients experience pain in missing limbs). Such patients still do have the awareness in (1), which is the main issue at hand, but they have the strange awareness in sense (2). So somatoparaphrenia leads some people to misidentify the bodily location of a sensation as someone else’s, but the awareness of the sensation itself remains one’s own. Lane and Liang (2010) are not satisfied and, among other things, counter that Rosenthal’s analogy to phantom limbs is faulty, and that he has still not explained why the identification of the bearer of the pain cannot also go astray.

Among other things, Gennaro (2015b) replies first that we must remember that many of these patients often deny feeling anything in the limb in question (Bottini et al. 2002). As Liang and Lane point out, patient FB (Bottini et al. 2002), while blindfolded, feels “no tactile sensation” (2009, 664) when the examiner would in fact touch the dorsal surface of FB’s hand. In these cases, it is particularly difficult to see what the problem is for HOT theory at all. But when there really is a bodily sensation of some kind, a HOT theorist might also argue that there are really two conscious states that seem to be at odds. There is a conscious feeling in a limb but also the (conscious) attribution of the limb to someone else. It is crucial to emphasize that somatoparaphrenia is often characterized as a delusion of belief, often under the broader category of anosognosia. A delusion is often defined as a false belief that is held based on an incorrect (and probably unconscious) inference about external reality or oneself that is firmly sustained despite what almost everyone else believes and despite what constitutes incontrovertible and obvious proof or evidence to the contrary (Bortolotti 2009, Radden 2010). In some cases, delusions seriously inhibit normal day-to-day functioning. Beliefs are often taken to be intentional states integrated with other beliefs. They are typically understood as caused by perceptions or experiences that then lead to action or behavior. Thus, somatoparaphrenia is, in some ways, closer to self-deception and involves frequent confabulation. For more on this disagreement as well as the phenomenon of thought insertion in schizophrenia, see Lane (2015) as well.

4. HOT Theory and Conceptualism

Consider again the related claim that HOT theory can explain how one’s conceptual repertoire can transform our phenomenological experience. Concepts, at minimum, involve recognizing and understanding objects and properties. Having a concept C should also give the concept possessor the ability to discriminate instances of C and non-C’s. For example, if I have the concept ‘tiger’ I should be able to identify tigers and distinguish them from other even fairly similar land animals. Rosenthal invokes the idea that acquiring concepts can change one’s conscious experience with the help of several well-known examples (2005, 187-188). Acquiring various concepts from a wine-tasting course will lead to different experiences from those taste experiences enjoyed prior to the course. I acquire more fine-grained wine-related concepts, such as “dry” and “heavy,” which in turn can figure into my HOTs and thus alter my conscious experiences. I literally have different qualia due to the change in my conceptual repertoire. As we learn more concepts, we have more fine-grained experiences and thus experience more qualitative complexities. A botanist will likely have somewhat different perceptual experiences than I do while walking through a forest. Conversely, those with a more limited conceptual repertoire, such as infants and animals, will often have a more coarse-grained set of experiences. Much the same goes for other sensory modalities, such as the way that I experience a painting after learning more about artwork and color. The notion of “seeing-as” (“hearing-as” and so on) is often used in this context, that is, when I possess different concepts I literally experience the world differently.

Thus, Gennaro argues that there is a very close and natural connection between HOT theory and what is known as “conceptualism” (Gennaro 2012, chapter six, 2013). Chuard (2007) defines conceptualism as the claim that “the representational content of a perceptual experience is fully conceptual in the sense that what the experience represents (and how it represents it) is entirely determined by the conceptual capacities the perceiver brings to bear in her experience” (Chuard 2007, 25). In any case, the basic idea is that, just like beliefs and thoughts, perceptual experiences also have conceptual content. In a somewhat Kantian spirit, one might say that all conscious experience presupposes the application of concepts, or, even stronger, the way that one experiences the world is entirely determined by the concepts one possesses. Indeed, Gunther (2003, 1) initially uses Kant’s famous slogan that “thoughts without content are empty, intuitions [= sensory experiences] without concepts are blind” to sum up conceptualism (Kant 1781/1965, A51/B75).

5. Hybrid Higher-Order and Self-Representational Theories

Some related representationalist views hold that the HOR in question should be understood as intrinsic to (or part of) an overall complex conscious state. This stands in contrast, for example, to the standard view that the HOT is extrinsic to (that is, entirely distinct from) its target mental state. One motivation for this shift is renewed interest in a view somewhat closer to the one held by Franz Brentano (1874/1973) and others, normally associated with the phenomenological tradition (Sartre 1956, Smith 2004). To varying degrees, these theories have in common the idea that conscious mental states, in some sense, represent themselves, which still involves having a thought about a mental state but just not a distinct or separate state. Thus, when one has a conscious desire for a beer, one is also aware that one is in that very state. The conscious desire represents both the beer and itself. It is this “self-representing” which makes the state conscious.

Gennaro has argued that, when one has a first-order conscious state, the (unconscious) HOT is better viewed as intrinsic to the target state, so that we have a complex conscious state with parts (Gennaro 1996, 2006, 2012). This is what he calls the “wide intrinsicality view” (WIV), which he takes to be a version of HOT theory; he argues elsewhere that Sartre’s theory of consciousness could be understood in this way (Gennaro 2002, 2015a). On the WIV, first-order conscious states are complex states with a world-directed part and a meta-psychological component. Robert Van Gulick (2000, 2004, 2006) has also explored the alternative that the HO state is part of an overall global conscious state. He calls such states “HOGS” (Higher-Order Global States) whereby a lower-order unconscious state is “recruited” into a larger state, which becomes conscious partly due to the implicit self-awareness that one is in the lower-order state.

This general approach is also forcefully advocated by Uriah Kriegel in a series of papers, beginning with Kriegel (2003) and culminating in Kriegel (2009). He refers to it as the “self-representational theory of consciousness” (see also Kriegel and Williford 2006). To be sure, the notion of a mental state representing itself or a mental state with one part representing another part is in need of further development. Nonetheless, there is agreement among all of these authors that conscious mental states are, in some important sense, reflexive or self-directed.

More specifically, Kriegel (2003, 2006, 2009) has tried to cash out TP in terms of a ubiquitous (conscious) “peripheral” self-awareness which accompanies all of our first-order focal conscious states. Not all conscious “directedness” is attentive and so perhaps we should not restrict conscious directedness to that which we are consciously focused on. If this is right, then a first-order conscious state can be both attentively outer-directed and inattentively inner-directed. Gennaro has argued against this view at length (Gennaro 2008, Gennaro 2012, chapter five). For example, although it is surely true that there are degrees of conscious attention, the clearest example of genuine “inattentive” consciousness is outer-directed awareness in one’s peripheral visual field. But this obviously does not show that any inattentional consciousness is self-directed during outer-directed consciousness, let alone at the very same time. Also, what is the evidence for such self-directed inattentional consciousness? It is presumably based on phenomenological considerations but he claims not to find such ubiquitous inattentive self-directed “consciousness” in his outer-directed conscious experience. Except when he is introspecting, Gennaro thinks that conscious experience is so completely outer directed that there really is no such peripheral self-directed consciousness when in first-order conscious states. He says that it does not seem to him that he is consciously aware of his own experience when, say, consciously attending to a band in concert or to the task of building a bookcase. Even some who are otherwise very sympathetic to Kriegel’s phenomenological approach find it difficult to believe that “pre-reflective” (inattentional) self-awareness accompanies conscious states (Siewert 1998, Zahavi 2004) or at least that all conscious states involve such self-awareness (Smith 2004). Self-representationalism is also a target of the objection discussed in section 3 regarding somatoparaphrenia and related deficits of self-awareness (for more on this dispute, see Lane 2015 and Billon and Kriegel 2015).

In the end, Kriegel actually holds that there is an indirect self-representation applicable to conscious states with the self-representational peripheral component directed at the world-directed part of the state (2009, 215-226). This seems closer to Gennaro’s WIV but Kriegel thinks that “pre-reflective self-awareness” or the “self-representation” is itself (peripherally) conscious. For others who hold some form of the self-representational view, see Williford (2006) and Janzen (2008). Carruthers’ (2000, 2005) theory can also be viewed in this light since, as we have seen, he contends that conscious states have two representational contents.

6. HOT Theory and the Prefrontal Cortex

An interesting topic in recent years concerns attempts to identify just how HOT theory and self-representationalism might be realized in the brain. We have seen that most representationalists tend to think that the structure of conscious states is realized in the brain (though it may take some time to identify all the main neural structures). The issue is sometimes framed in terms of the question: “how global is HOT theory?” That is, do conscious mental states require widespread brain activation or can at least some be fairly localized in narrower areas of the brain? Perhaps most interesting is whether or not the prefrontal cortex (PFC) is required for having conscious states (Gennaro 2012, chapter nine). Gennaro disagrees with Kriegel (2007, 2009 chapter seven) and Block (2007) that, according to the higher-order and self-representational view, the PFC is required for most conscious states (see also Del Cul et al. 2007, Lau and Rosenthal 2011). It may very well be that the PFC is required for the more sophisticated introspective states but this is not a problem for HOT theory as such because it does not require introspection for having first-order conscious states.

Are there conscious states without PFC activity? It seems so. For example, Rafael Malach and colleagues show that when subjects are engaged in a perceptual task or absorbed in watching a movie, there is widespread neural activation but little PFC activity (Grill-Spector and Malach 2004, Goldberg, Harel, and Malach 2006). Although some other studies do show PFC activation, this is mainly because of the need for subjects to report their experiences. Also, basic conscious experience is certainly not entirely eliminated even when there is extensive bilateral PFC damage or lobotomies (Pollen 2008). Zeki (2007) also cites evidence that the frontal cortex is engaged only when reportability is part of the conscious experience and that all human color imaging experiments have been unanimous in not showing any particular activation of the frontal lobes. Similar results are found for other sensory modalities, for example, in auditory perception (Baars and Gage 2010, chapter seven). Although areas outside the auditory cortex are sometimes cited, there is virtually no mention of the PFC.

Gennaro thinks that the above line of argument actually works to the advantage of HOT theory with regard to the problem of animal and infant consciousness. If HOT theory does not require PFC activity for all conscious states, then HOT theory is in an even better position to account for animal and infant consciousness since it is doubtful that they have the requisite PFC activity.

But why think that unconscious HOTs can occur outside the PFC? If we grant that unconscious HOTs can be regarded as a kind of “pre-reflective” self-consciousness, then one might for example look to Newen and Vogeley (2003) for answers. They distinguish five levels of self-consciousness ranging from “phenomenal self-acquaintance” and “conceptual self-consciousness” up to “iterative meta-representational self-consciousness.” They are explicitly concerned with the neural correlates of what they call the “first-person perspective” (1PP) and the “egocentric reference frame.” Citing numerous experiments, they point to various neural signatures of self-consciousness. The PFC is rarely mentioned and then usually only with regard to more sophisticated forms of self-consciousness. Other brain areas are much more prominently identified, such as the medial and inferior parietal cortices, the temporoparietal cortex, the posterior cingulate cortex, and the anterior cingulate cortex (ACC). Kriegel (2007) also mentions the ACC as a possible location for HOTs but it should be noted that the ACC is, at least sometimes, considered to be part of the PFC.

Damasio (1999) explicitly mentions the ACC as a site for some higher-order mental activity or “maps.” There are various cortical association areas that might be good candidates for HOTs depending on the modality. For example, key regions for spatial navigation comprise the medial parietal and right inferior parietal cortex, posterior cingulate cortex, and the hippocampus. Even when considering the neural signatures of theory of mind and mind-reading, Newen and Vogeley have replicated experiments indicating that such meta-representation is best located in the ACC. In addition, “the capacity for taking 1PP in such [theory of mind] contexts showed differential activation in the right temporo-parietal junction and the medial aspects of the superior parietal lobe” (Newen and Vogeley 2003, 538). Once again, even if the PFC is essential for having certain HOTs and conscious states, this poses no threat to HOT theory provided that the HOTs in question are of the more sophisticated introspective variety.

This matter is certainly not yet settled but Gennaro urges that it is a mistake, both philosophically and neurophysiologically, to claim that HOT theory should treat first-order conscious states as essentially including PFC activity. Further, and to tie this together with the animals issue, Gennaro concedes the following: “If all HOTs occur in the PFC, and if PFC activity is necessary for all conscious experience, and if there is little or no PFC activity in infants and most animals, then either (a) infants and most animals do not have conscious experience or (b) HOT theory is false” (Gennaro 2012, 281). Carruthers (2000, 2005) and perhaps Rosenthal opt for (a). Still, Gennaro argues that a good case can be made for the falsity of one or more of the conjuncts in the antecedent of the above conditional.

Kozuch (2014) presents a very nice discussion of the PFC in relation to higher-order theories, arguing that the lack of dramatic deficits in visual consciousness even with PFC lesions presents a compelling case against higher-order theories. For example, in addition to the studies cited above, Kozuch references Alvarez and Emory (2006) as evidence for the view that

Lesions to the orbital, lateral, or medial PFC produce so-called executive dysfunction. Depending on the precise lesion location, subjects with damage to one of these areas have problems inhibiting inappropriate actions, switching efficiently from task to task, or retaining items in short-term memory. However, lesions to these areas appear not to produce notable deficits in visual consciousness: Tests of the perceptual abilities of subjects with lesions to the PFC proper reveal no such deficits; as well, PFC patients never report their visual experience to have changed in some remarkable way (Kozuch 2014, 729).

Kozuch notes that Gennaro’s WIV may be left undamaged, at least to some extent, since he does not require that the PFC is where HOTs are realized. It is also important to keep in mind the distinction between unconscious HOTs and conscious HOTs (= introspection). Perhaps the latter require PFC activity given the more sophisticated executive functions associated with introspection but having first-order conscious states does not require introspection. Yet another interesting argument along these lines is put forth by Sebastian (2014) with respect to some dream states. If some dreams are conscious states and there is little, if any, PFC activity during the dream period, then HOT theory would again be in trouble if we suppose that HOTs are realized in the PFC.

In conclusion, higher-order theory has remained a viable theory of consciousness, especially for those attracted to a reductionist account but not presently to a reduction in purely neurophysiological terms. Although there are significant objections to different versions of HOR, at least some plausible replies have emerged through the years. HOR also maintains a degree of intuitive plausibility due to the Transitivity Principle (TP). In addition, HOT theory might help to shed light on conceptualism and can contribute to the question of the PFC’s role in producing conscious states.

7. References and Further Reading

  • Alvarez, J. and Emory, E. 2006. Executive Function and the Frontal Lobes: A Meta-Analytic Review. Neuropsychology Review 16: 17-42.
  • Armstrong, D. 1981. What is Consciousness? In The Nature of Mind. Ithaca, NY: Cornell University Press.
  • Baars, B. and Gage, N. 2010. Cognition, Brain, and Consciousness: Introduction to Cognitive Neuroscience. Second Edition. Oxford: Elsevier.
  • Bayne, T. and Montague, M. eds. 2011. Cognitive Phenomenology. New York: Oxford University Press.
  • Billon, A. and Kriegel, U. 2015. Jaspers’ Dilemma: The Psychopathological Challenge to Subjectivity Theories of Consciousness. In R. Gennaro ed. Disturbed Consciousness. Cambridge, MA: MIT Press.
  • Block, N. 1996. Mental Paint and Mental Latex. In E. Villanueva ed. Perception. Atascadero, CA: Ridgeview.
  • Block, N. 2007. Consciousness, Accessibility, and the Mesh between Psychology and Neuroscience. Behavioral and Brain Sciences 30: 481-499.
  • Block, N. 2011. The Higher-Order Approach to Consciousness is Defunct. Analysis 71: 419-431.
  • Bottini, G., Bisiach, E., Sterzi, R., and Vallar, G. 2002. Feeling Touches in Someone Else’s Hand. NeuroReport 13: 249-252.
  • Bortolotti, L. 2009. Delusions and Other Irrational Beliefs. New York: Oxford University Press.
  • Brentano, F. 1874/1973. Psychology From an Empirical Standpoint. New York: Humanities.
  • Byrne, A. 1997. Some like it HOT: Consciousness and Higher-Order Thoughts. Philosophical Studies 86: 103-129.
  • Byrne, A. 2001. Intentionalism Defended. Philosophical Review 110: 199-240.
  • Carruthers, P. 1989. Brute Experience. Journal of Philosophy 86: 258-269.
  • Carruthers, P. 2000. Phenomenal Consciousness. Cambridge: Cambridge University Press.
  • Carruthers, P. 2004. HOP over FOR, HOT Theory. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Carruthers, P. 2005. Consciousness: Essays from a Higher-Order Perspective. New York: Oxford University Press.
  • Carruthers, P. 2008. Meta-Cognition in Animals: A Skeptical Look. Mind and Language 23: 58-89.
  • Carruthers, P. 2009. How we Know our Own Minds: The Relationship Between Mindreading and Metacognition. Behavioral and Brain Sciences 32: 121-138.
  • Chalmers, D. 1995. Facing Up to the Problem of Consciousness. Journal of Consciousness Studies 2: 200-219.
  • Chalmers, D. 1996. The Conscious Mind. New York: Oxford University Press.
  • Chalmers, D. 2004. The Representational Character of Experience. In B. Leiter ed. The Future for Philosophy. Oxford: Oxford University Press.
  • Chuard, P. 2007. The Riches of Experience. In R. Gennaro ed. The Interplay between Consciousness and Concepts. Exeter: Imprint Academic.
  • Chudnoff, E. 2015. Cognitive Phenomenology. New York: Routledge.
  • Clayton, N., Bussey, T., and Dickinson, A. 2003. Can Animals Recall the Past and Plan for the Future? Nature Reviews Neuroscience 4: 685-691.
  • Clayton, N., Emery, N., and Dickinson, A. 2006. The Rationality of Animal Memory: Complex Caching Strategies of Western Scrub Jays. In Hurley and Nudds 2006.
  • Coleman, S. 2015. Quotational Higher-Order Thought Theory. Philosophical Studies 172: 2705-2733.
  • Damasio, A. 1999. The Feeling of What Happens. New York: Harcourt Brace and Co.
  • Del Cul, A., Baillet, S., and Dehaene, S. 2007. Brain Dynamics Underlying the Nonlinear Threshold for Access to Consciousness. PLoS Biology 5: 2408-2423.
  • Dretske, F. 1995. Naturalizing the Mind. Cambridge, MA: MIT Press.
  • Droege, P. 2003. Caging the Beast. Philadelphia and Amsterdam: John Benjamins Publishers.
  • Emery, N. and Clayton, N. 2001. Effects of Experience and Social Context on Prospective Caching Strategies in Scrub Jays. Nature 414: 443-446.
  • Farah, M. 2004. Visual Agnosia, 2nd ed. Cambridge, MA: MIT Press.
  • Flombaum, J. and Santos, L. 2005. Rhesus Monkeys Attribute Perceptions to Others. Current Biology 15: 447-452.
  • Gennaro, R. 1993. Brute Experience and the Higher-Order Thought Theory of Consciousness. Philosophical Papers 22: 51-69.
  • Gennaro, R. 1996. Consciousness and Self-consciousness: A Defense of the Higher-Order Thought Theory of Consciousness. Amsterdam and Philadelphia: John Benjamins.
  • Gennaro, R. 2002. Jean-Paul Sartre and the HOT Theory of Consciousness. Canadian Journal of Philosophy 32: 293-330.
  • Gennaro, R. ed. 2004a. Higher-Order Theories of Consciousness: An Anthology. Amsterdam and Philadelphia: John Benjamins.
  • Gennaro, R. 2004b. Higher-Order Thoughts, Animal Consciousness, and Misrepresentation: A Reply to Carruthers and Levine. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Gennaro, R. 2005. The HOT Theory of Consciousness: Between a Rock and a Hard Place? Journal of Consciousness Studies 12 (2): 3-21.
  • Gennaro, R. 2006. Between Pure Self-Referentialism and the (extrinsic) HOT Theory of Consciousness. In U. Kriegel and K. Williford eds. Self-Representational Approaches to Consciousness. Cambridge, MA: MIT Press.
  • Gennaro, R. 2008. Representationalism, Peripheral Awareness, and the Transparency of Experience. Philosophical Studies 139: 39-56.
  • Gennaro, R. 2009. Animals, consciousness, and I-thoughts. In R. Lurz ed. Philosophy of Animal Minds. New York: Cambridge University Press.
  • Gennaro, R. 2012. The Consciousness Paradox: Consciousness, Concepts, and Higher-Order Thoughts. Cambridge, MA: The MIT Press.
  • Gennaro, R. 2013. Defending HOT Theory and the Wide Intrinsicality View: A reply to Weisberg, Van Gulick, and Seager. Journal of Consciousness Studies 20 (11-12): 82-100.
  • Gennaro, R. 2015a. The ‘of’ of Intentionality and the ‘of’ of Acquaintance. In S. Miguens, G. Preyer, and C. Morando eds. Pre-Reflective Consciousness: Sartre and Contemporary Philosophy of Mind. New York: Routledge Publishers.
  • Gennaro, R. 2015b. Somatoparaphrenia, Anosognosia, and Higher-Order Thoughts. In R. Gennaro ed. Disturbed Consciousness. Cambridge, MA: MIT Press.
  • Gennaro, R. ed. 2015c. Disturbed Consciousness: New Essays on Psychopathology and Theories of Consciousness. Cambridge, MA: The MIT Press.
  • Goldberg, I., Harel, M., and Malach, R. 2006. When the Brain Loses its Self: Prefrontal Inactivation during Sensorimotor Processing. Neuron 50: 329-339.
  • Goldman, A. 1993. Consciousness, Folk Psychology and Cognitive Science. Consciousness and Cognition 2: 264-82.
  • Goldman, A. 2006. Simulating Minds. New York: Oxford University Press.
  • Grill-Spector, K. and Malach, R. 2004. The Human Visual Cortex. Annual Review of Neuroscience 27: 649-677.
  • Gunther, Y. ed. 2003. Essays on Nonconceptual Content. Cambridge, MA: MIT Press.
  • Harman, G. 1990. The Intrinsic Quality of Experience. In J. Tomberlin ed. Philosophical Perspectives, 4. Atascadero, CA: Ridgeview Publishing.
  • Horgan, T. and Tienson, J. 2002. The Intentionality of Phenomenology and the Phenomenology of Intentionality. In D. Chalmers ed. Philosophy of Mind: Classical and Contemporary Readings. New York: Oxford University Press.
  • Hurley, S. and Nudds, M. eds. 2006. Rational Animals? New York: Oxford University Press.
  • Janzen, G. 2008. The Reflexive Nature of Consciousness. Amsterdam and Philadelphia: John Benjamins.
  • Jehle, D. and Kriegel, U. 2006. An Argument against Dispositional HOT Theory. Philosophical Psychology 19: 462-476.
  • Kant, I. 1781/1965. Critique of Pure Reason. Translated by N. Kemp Smith. New York: MacMillan.
  • Kozuch, B. 2014. Prefrontal Lesion Evidence against Higher-Order Theories of Consciousness. Philosophical Studies 167: 721-746.
  • Kriegel, U. 2002. PANIC Theory and the Prospects for a Representational Theory of Phenomenal Consciousness. Philosophical Psychology 15: 55-64.
  • Kriegel, U. 2003. Consciousness as Intransitive Self-Consciousness: Two Views and an Argument. Canadian Journal of Philosophy 33: 103-132.
  • Kriegel, U. 2005. Naturalizing Subjective Character. Philosophy and Phenomenological Research 71: 23-56.
  • Kriegel, U. 2006. The Same Order Monitoring Theory of Consciousness. In U. Kriegel and K. Williford eds. Self-Representational Approaches to Consciousness. Cambridge, MA: MIT Press.
  • Kriegel, U. 2007. A Cross-Order Integration Hypothesis for the Neural Correlate of Consciousness. Consciousness and Cognition 16: 897-912.
  • Kriegel, U. 2009. Subjective Consciousness. New York: Oxford University Press.
  • Kriegel, U. ed. 2013. Phenomenal Intentionality. New York: Oxford University Press.
  • Kriegel, U. and Williford, K. eds. 2006. Self-Representational Approaches to Consciousness. Cambridge, MA: MIT Press.
  • Lane, T. 2015. Self, Belonging, and Conscious Experience: A Critique of Subjectivity Theories of Consciousness. In R. Gennaro ed. Disturbed Consciousness. Cambridge, MA: MIT Press.
  • Lane, T. and Liang, C. 2010. Mental Ownership and Higher-Order Thought. Analysis 70: 496-501.
  • Lau, H. and Rosenthal, D. 2011. Empirical Support for Higher-Order Theories of Conscious Awareness. Trends in Cognitive Sciences 15: 365–373.
  • Levine, J. 2001. Purple Haze: The Puzzle of Conscious Experience. Cambridge, MA: MIT Press.
  • Liang, C. and Lane, T. 2009. Higher-Order Thought and Pathological Self: The Case of Somatoparaphrenia. Analysis 69: 661-668.
  • Lurz, R. ed. 2009. The Philosophy of Animal Minds. New York: Cambridge University Press.
  • Lurz, R. 2011. Mindreading Animals. Cambridge, MA: MIT Press.
  • Lycan, W. 1996. Consciousness and Experience. Cambridge, MA: MIT Press.
  • Lycan, W. 2001. A Simple Argument for a Higher-Order Representation Theory of Consciousness. Analysis 61: 3-4.
  • Lycan, W. 2004. The Superiority of HOP to HOT. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Nagel, T. 1974. What is it Like to be a Bat? Philosophical Review 83: 435-456.
  • Neander, K. 1998. The Division of Phenomenal Labor: A Problem for Representational Theories of Consciousness. Philosophical Perspectives 12: 411-434.
  • Newen, A. and Vogeley, K. 2003. Self-Representation: Searching for a Neural Signature of Self-Consciousness. Consciousness and Cognition 12: 529-543.
  • Nichols, S. and Stich, S. 2003. Mindreading. New York: Oxford University Press.
  • Picciuto, V. 2011. Addressing Higher-Order Misrepresentation with Quotational Thought. Journal of Consciousness Studies 18 (3-4): 109-136.
  • Pollen, D. 2008. Fundamental Requirements for Primary Visual Perception. Cerebral Cortex 18: 1991-1998.
  • Prinz, J. 2012. The Conscious Brain. New York: Oxford University Press.
  • Radden, J. 2010. On Delusion. Abingdon and New York: Routledge.
  • Rolls, E. 2004. A Higher Order Syntactic Thought (HOST) Theory of Consciousness. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Rosenthal, D.M. 1986. Two Concepts of Consciousness. Philosophical Studies 49: 329-359.
  • Rosenthal, D.M. 1991. The Independence of Consciousness and Sensory Quality. Philosophical Issues 1: 15-36.
  • Rosenthal, D.M. 1997. A Theory of Consciousness. In N. Block, O. Flanagan, and G. Güzeldere eds. The Nature of Consciousness. Cambridge, MA: MIT Press.
  • Rosenthal, D.M. 2002. Explaining Consciousness. In D. Chalmers ed. Philosophy of Mind: Classical and Contemporary Readings. New York: Oxford University Press.
  • Rosenthal, D.M. 2004. Varieties of Higher-Order Theory. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Philadelphia and Amsterdam: John Benjamins.
  • Rosenthal, D.M. 2005. Consciousness and Mind. New York: Oxford University Press.
  • Rosenthal, D.M. 2010. Consciousness, the Self and Bodily Location. Analysis 70: 270-276.
  • Rosenthal, D.M. 2011. Exaggerated Reports: Reply to Block. Analysis 71: 431-437.
  • Santos, L., Nissen, A., and Ferrugia, J. 2006. Rhesus Monkeys, Macaca mulatta, Know What Others Can and Cannot Hear. Animal Behaviour 71: 1175-1181.
  • Sartre, J. 1956. Being and Nothingness. New York: Philosophical Library.
  • Sauret, W. and Lycan, W. 2014. Attention and Internal Monitoring: A Farewell to HOP. Analysis 74: 363-370.
  • Seager, W. 2004. A Cold Look at HOT Theory. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Searle, J. 1992. The Rediscovery of the Mind. Cambridge, MA: MIT Press.
  • Sebastián, M. 2013. Not a HOT Dream. In R. Brown ed. Consciousness Inside and Out: Phenomenology, Neuroscience, and the Nature of Experience. Dordrecht: Springer.
  • Sierra, M. and Berrios, G. 2000. The Cambridge Depersonalisation Scale: a New Instrument for the Measurement of Depersonalisation. Psychiatry Research 93: 153-164.
  • Siewert, C. 1998. The Significance of Consciousness. Princeton: Princeton University Press.
  • Smith, D.W. 2004. Mind World: Essays in Phenomenology and Ontology. New York: Cambridge University Press.
  • Terrace, H. and Metcalfe, J. eds. 2005. The Missing Link in Cognition: Origins of Self-Reflective Consciousness. New York: Oxford University Press.
  • Tye, M. 1995. Ten Problems of Consciousness. Cambridge, MA: MIT Press.
  • Tye, M. 2000. Consciousness, Color, and Content. Cambridge, MA: MIT Press.
  • Vallar, G. and Ronchi, R. 2009. Somatoparaphrenia: A Body Delusion. A Review of the Neuropsychological Literature. Experimental Brain Research 192: 533-551.
  • Van Gulick, R. 1995. What Would Count as Explaining Consciousness? In T. Metzinger ed. Conscious Experience. Paderborn: Ferdinand Schöningh.
  • Van Gulick, R. 2000. Inward and Upward: Reflection, Introspection and Self-awareness. Philosophical Topics 28: 275-305.
  • Van Gulick, R. 2004. Higher-Order Global States (HOGS): An Alternative Higher-Order Model of Consciousness. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Van Gulick, R. 2006. Mirror Mirror—Is That All? In U. Kriegel and K. Williford eds. Self-Representational Approaches to Consciousness. Cambridge, MA: MIT Press.
  • Weisberg, J. 2008. Same Old, Same Old: The Same-Order Representation Theory of Consciousness and the Division of Phenomenal Labor. Synthese 160: 161-181.
  • Weisberg, J. 2011. Misrepresenting Consciousness. Philosophical Studies 154: 409-433.
  • Williford, K. 2006. The Self-Representational Structure of Consciousness. In Kriegel and Williford 2006.
  • Zahavi, D. 2004. Back to Brentano? Journal of Consciousness Studies 11 (10-11): 66-87.
  • Zahavi, D. 2007. The Heidelberg School and the Limits of Reflection. In S. Heinämaa, V. Lähteenmäki, and P. Remes eds. Consciousness: From perception to reflection in the history of philosophy. Dordrecht: Springer.
  • Zeki, S. 2007. A Theory of Micro-Consciousness. In M. Velmans and S. Schneider eds. The Blackwell Companion to Consciousness. Malden, MA: Blackwell.

 

Author Information

Rocco J. Gennaro
Email: rjgennaro@usi.edu
University of Southern Indiana
U. S. A.

Capital Punishment

Capital punishment, or “the death penalty,” is an institutionalized practice designed to result in deliberately executing persons in response to actual or supposed misconduct and following an authorized, rule-governed process to conclude that the person is responsible for violating norms that warrant execution.  Punitive executions have historically been imposed by diverse kinds of authorities, for an expansive range of conduct, for political or religious beliefs and practices, for a status beyond one’s control, or without employing any significant due process procedures.  Punitive executions also have been and continue to be carried out more informally, such as by terrorist groups, urban gangs, or mobs.  But for centuries in Europe and America, discussions have focused on capital punishment as an institutionalized, rule-governed practice of modern states and legal systems governing serious criminal conduct and procedures.

Capital punishment has existed for millennia, as evident from ancient law codes and Plato’s famous rendition of Socrates’s trial and execution by democratic Athens in 399 B.C.E.  Among major European philosophers, specific or systematic attention to the death penalty is the exception until about 400 years ago.  Most modern philosophic attention to capital punishment emerged from penal reform proponents, as principled, moral evaluation of law and social practice, or amidst theories of the modern state and sovereignty.  The mid-twentieth century emergence of an international human rights regime and American constitutional controversies sparked anew much philosophic focus on theories of punishment and the death penalty, including arbitrariness, mistakes, or discrimination in the American institution of capital punishment.

The central philosophic question about capital punishment is one of moral justification:  on what grounds, if any, is the state’s deliberate killing of identified offenders a morally justifiable response to voluntary criminal conduct, even the most serious of crimes, such as murder?  As with questions about the morality of punishment, two broadly different approaches are commonly distinguished: retributivism, with a focus on past conduct that merits death as a penal response, and utilitarianism or consequentialism, with attention to the effects of the death penalty, especially any effects in preventing more crime through deterrence or incapacitation.  Section One provides some historical context and basic concepts for locating the central philosophic question about capital punishment:  Is death the amount or kind of penalty that is morally justified for the most serious of crimes, such as murder?  Section Two attends to classic considerations of lex talionis (“the law of retaliation”) and recent retributivist approaches to capital punishment that involve the right to life or a conception of fairness.  Section Three considers classic utilitarian approaches to justifying the death penalty: primarily as preventer of crime through deterrence or incapacitation, but also with respect to some other consequences of capital punishment.  Section Four attends to relatively recent approaches to punishment as expression or communication of fundamental values or norms, including for purposes of educating or reforming offenders.  Section Five explores issues of justification related to the institution of capital punishment, as in America: Is the death penalty morally justifiable if imperfect procedures produce mistakes, caprice, or (racial) discrimination in determining who is to be executed? Or if the actual execution of capital punishment requires unethical conduct by medical practitioners or other necessary participants?  Section Six considers the moral grounds, if any exist, for the state’s authority to punish by death.

Table of Contents

  1. Context and Basic Concepts
    1. Historical Practices
    2. Philosophic Frameworks and Approaches
  2. Retributivist Approaches
    1. Classic Retributivism: Kant and lex talionis
    2. Lex talionis as a Principle of Proportionality
    3. Retributivism and the Right to Life
    4. Retributivism and Fairness
    5. Challenges to Retributivism
  3. Utilitarian Approaches
    1. Classic Utilitarian Approaches: Bentham, Beccaria, Mill
    2. Empirical Considerations: Incapacitation, Deterrence
    3. Utilitarian Defenses: “Common Sense” and “Best Bet”
    4. Challenges to Utilitarianism
    5. Other Consequential Considerations
  4. Capital Punishment as Communication
  5. The Institution of Capital Punishment
    1. Procedural Issues: Imperfect Justice
    2. Discrimination: Race, Class
    3. Medicine and the Death Penalty
    4. Costs: Economic Issues
  6. State Authority and Capital Punishment
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Context and Basic Concepts

a. Historical Practices

Much philosophic focus on the death penalty is modern and relatively recent.  The phrase ‘capital punishment’ is older, used for nearly a millennium to signify the death penalty.  The classical Latin and medieval French roots of the term ‘capital’ indicate a punishment involving the loss of head or life, perhaps reflecting the use of beheading as a form of execution.  The actual practice of capital punishment is ancient, emerging much earlier than the familiar terms long used to refer to it.  In the ancient world, the Babylonian Code of Hammurabi (circa 1750 B.C.E.) included about 25 capital crimes; the Mosaic Code of the ancient Hebrews identifies numerous crimes punishable by death, invoking, like other ancient law codes, lex talionis, “the law of retaliation”; Draco’s Code of 621 B.C.E. Athens punished most crimes by death, and later Athenian law famously licensed the trial and death of Socrates; the fifth century B.C.E. Twelve Tables of Roman law include capital punishment for such crimes as publishing insulting songs or disturbing the nocturnal peace of urban areas, and later Roman law famously permitted the crucifixion of Jesus of Nazareth.  Even in such early practices, capital punishment was seen as within the authority of political rulers, embodied as a legal institution, and employed for a wide range of misconduct proscribed by law.

Medieval and early modern Europe retained expansive lists of capital crimes and notably expanded the forms of execution beyond the common ancient practices of stoning, crucifixion, drowning, beating to death, or poisoning.  In the Middle Ages both secular and ecclesiastical authorities participated in executions deliberately designed to be torturous and brutal, such as beheading, burning alive, drawing and quartering, hanging, disemboweling, using the rack, using thumb-screws, pressing with weights, boiling in oil, publicly dissecting, and castrating.  Such brutality was conducted publicly as spectacle and ritual: an important or even essential element of capital punishment was not only the death of the accused, but the public process of killing and dying on display.  The severity of capital punishment varied with the spectrum of torturous ways in which political and other penal authorities eventually effected the offender’s death.

In “the new world” the American colonies’ use of the death penalty was influenced more by Britain than by any other nation.  England’s “Bloody Code,” which took shape from the late 17th century, came to include over 200 capital crimes, and the American colonies followed England in using public, ritualized hangings as the common form of execution.  Until the mid-18th century, the colonies employed elaborate variations of the ritual of execution by hanging, even to the point of holding fake hangings.  Stuart Banner summarizes the early American practices:

Capital punishment was more than just one penal technique among others. It was the base point from which all other kinds of punishment deviated.  When the state punished serious crime, most of the methods …were variations on execution.  Officials imposed death sentences that were never carried out, they conducted mock hangings…, and they dramatically halted real execution ceremonies at the last minute.  These were methods of inflicting a symbolic death …. Officials also wielded a set of tools capable of intensifying a death sentence – burning at the stake, public display of the corpse, dismemberment and dissection – ways of producing a punishment worse than death. (54)

In early America “capital punishment was not just a single penalty,” but “a spectrum of penalties with gradations of severity above and below an ordinary execution” (Banner, 86).

The late 18th century brought a “dramatic transformation of penal thought and practice” that was international in scope (Banner, 89). The dramatic change came with the birth of publicly supported prisons or penitentiaries that allowed extended incarceration for large numbers of people (Banner, 99).  Before prisons and the practical possibility of lengthy incarceration as an alternative, “the only available units of measurement for serious crime were degrees of deviation from an ordinary execution” (Banner, 70).  After the invention of prisons, for serious crimes there was now an alternative to capital punishment and to the practiced spectrum of torturous executions: prisons allowed varying conditions of confinement (for example, hard labor, solitary confinement, loss of privacy) and a temporal measure, at least, for distinguishing degrees of punishment to address kinds of serious misconduct.  Dramatic changes for capital punishment also came with the 1764 publication in Italy of Cesare Beccaria’s essay, “On Crimes and Punishments.”  Very influential in Europe and the United States, Beccaria’s sustained, philosophic investigation of the death penalty challenged both the authority of the state to punish by death and the utility of capital punishment as a superior deterrent to lengthy imprisonment.  Philosophic defenses of the death penalty, like that of Immanuel Kant, opposed reformers and others, who, like Beccaria, argued for abolition of capital punishment.  During the 19th century the methods of execution were made less brutal and the number of capital crimes was much reduced compared to earlier centuries of practice.  Discussions of the death penalty’s merits invoked divergent understandings of the aims of punishment in general and thus of capital punishment in particular.

By the mid-20th century, two developments prompted another period of focused philosophic attention to the death penalty.  In the United States a series of Supreme Court cases challenged whether the death penalty falls under the constitutional prohibition of “cruel and unusual punishments,” including questions about the legal and moral import of a criminal justice process that results in mistakes, caprice, or racial discrimination in capital cases.  Capital punishment also became a global concern with the post-World War II Nuremberg trials of Nazi leaders and after the 1948 Universal Declaration of Human Rights and subsequent human rights treaties explicitly accorded all persons a right to life and encouraged abolishing the death penalty worldwide.  Most nations have now abolished capital punishment, with notable exceptions including China, North Korea, Japan, India, Indonesia, Egypt, Somalia, and the United States, the only western “industrialized” nation still retaining the death penalty.

b. Philosophic Frameworks and Approaches

Capital punishment is often explored philosophically in the context of more general theories of “the standard or central case” of punishment as an institution or practice within a structure of legal rules (Hart, “Prolegomenon,” 3-5).  The philosopher’s interest in the death penalty, then, is embedded in broader issues about the moral permissibility of punishment.  Any punishment – and certainly an execution – intentionally inflicts on a person significant pain, suffering, unpleasantness, or deprivation that it is ordinarily wrong for an authority like the state to impose.  What conditions or considerations, if any, would morally justify such penal practices?  Following a framework famously offered by H.L.A. Hart,

[w]hat we should look for are answers to a number of different questions such as:  What justifies the general practice of punishment? To whom may punishment be applied? How severely may we punish? (“Prolegomenon,” 3)

These different questions are, respectively, about the general justifying aim of punishment, about the conditions of responsibility for criminal conduct and liability to punishment, and about the amount, kind, or form of punishment justifiable to address actual or supposed misconduct.  It is the last of these questions of justification – about the justified amount, kind, or form of punishment – that is foremost in philosophic approaches to the death penalty.  Almost all modern and recent discussions of capital punishment assume liability for the death penalty is only for the gravest of crimes, such as murder; almost all assume comparatively humane modes of execution and largely ignore considering obviously torturous or brutal killings of offenders; and it is assumed that some amount of punishment is merited for murderers.  The central question, then, is not often whether punishing murderers is morally justifiable (rather than rehabilitation or release, for example), but whether it is morally justifiable to punish by death (rather than by imprisonment, for example) those found to have committed a grave offense, such as murder.  Responses to this question about the death penalty often build on more general principles or theories about the purposes of punishment in general, and about general criteria for determining the proper measure or amount of punishment for various crimes.

Philosophers typically identify two broadly different ways of thinking about the moral merits of punishment in general, and about whether capital punishment is a proper amount of punishment to address serious criminal misconduct (see “Punishment”). Justifications are proposed either with reference to forward-looking considerations, such as various future effects or consequences of capital punishment, or with reference to backward-looking considerations, such as facets of the wrongdoing to be punished.  The latter approach, when such considerations are dominant, has since the 1930s been called ‘retributivism’; retributivist justifications “look back” to the offense committed in order to link directly the amount, kind, or form of punishment to what the offense merits as penal response.  This linkage is often characterized as whether a punishment “fits” the crime committed.  For retributivists, any beneficial effects or consequences of capital punishment are wholly irrelevant or distinctly secondary.  Forward-looking justifications of punishment have been labeled ‘utilitarian’ since the 19th century and, since the mid-20th century, other versions are sometimes called ‘consequentialist’. Consequentialist or utilitarian approaches to the death penalty are distinguished from retributivist approaches because the former rely only on assessing the future effects or consequences of capital punishment, such as crime prevention through deterrence and incapacitation.

2. Retributivist Approaches

Retributivists approach justifying the amount of punishment for misconduct by “looking back” to aspects of the wrongdoing committed.  There are many different versions of retributivism; all maintain a tight, essential link between the offense voluntarily committed and the amount, form, or kind of punishment justifiably threatened or imposed.  Future effects or consequences, if any, are then irrelevant or distinctly secondary considerations to justifying punishments for misconduct, including the death penalty.  Retributivism about capital punishment often prominently appeals to the principle of lex talionis, or “the law of retaliation,” an idea popularly familiarized in the ancient and biblical phrase, “an eye for an eye and a tooth for a tooth.”  Forms of retributivism vary according to their interpretation of lex talionis or in their appealing to alternative moral notions, such as basic moral rights or a principle of fairness.

a. Classic Retributivism: Kant and lex talionis

A classic expression of retributivism about capital punishment can be found in a late 18th century treatise by Immanuel Kant, The Metaphysical Elements of Justice (99-107; Ak. 331-337).  After dismissing Cesare Beccaria’s abolitionist stance and reliance on “sympathetic sentimentality and an affectation of humanitarianism,” Kant appeals to an interpretation of lex talionis, what he calls “jus talionis” or “the Law of Retribution,” as justifying capital punishment:

Judicial punishment… must in all cases be imposed on him only on the ground that he committed a crime.… He must first be found deserving of punishment… The law concerning punishment is a categorical imperative. (100; Ak. 331) What kind and degree of punishment does public legal justice adopt as its principle and standard?  None other than the principle of equality….  Only the Law of Retribution (jus talionis) can determine exactly the kind and degree of punishment (101; Ak. 332).

Kant then explicitly applies these principles to determine the punishment for the most serious of crimes:

If… he has committed a murder, he must die.  In this case, there is no substitute that will satisfy the requirements of legal justice. There is no sameness of kind between death and remaining alive even under the most miserable conditions, and consequently there is also no equality between the crime and retribution unless the criminal is judicially condemned and put to death (102; Ak. 333).

Kant then employs a hypothetical case to insist that any social effects of the death penalty, good or bad, are wholly irrelevant to its justification:

Even if a civil society were to dissolve… the last murderer in prison would first have to be executed so that each should receive his just deserts and that the people should not bear the guilt of a capital crime… [and] be regarded as accomplices in the public violation of justice (102; Ak. 333).

So, even if social effects are not possible, since the society no longer exists, the death penalty is justified for murder.  Kant exemplifies a pure retributivism about capital punishment: murderers must die for their offense, social consequences are wholly irrelevant, and the basis for linking the death penalty to the crime is “the Law of Retribution,” the ancient maxim, lex talionis, rooted in “the principle of equality.”

The key to Kant’s defense of capital punishment is “the principle of equality,” by which the proper, merited amount and kind of punishment is determined for crimes.  Whether the best interpretation of Kant or not, the idea behind this common approach seems to be that offenders must suffer a punishment equal to the victim’s suffering: “an eye for an eye, a tooth for a tooth,” a life for a life.  But as often noted, any literalism about lex talionis cannot work as a general principle linking crimes and punishments. It seems to imply that the merited punishment for rape is to be raped, for robbery to be stolen from, for fraud to be defrauded, for assault to be assaulted, for arson to be “burned out,” etc.  For other crimes—forgery, drug peddling, serial killings or massacres, terrorism, genocide, smuggling—it is not at all clear what kind or form of punishment lex talionis would then license or require (for example, Nathanson 72-75).  As C. L. Ten succinctly says, “it would appear that the single murder is one of the few cases in which the lex talionis can be applied literally” (151).  Both practical considerations and moral principles about permissible forms of punishment, then, ground objections to invoking a literal interpretation of lex talionis to justify capital punishment for murder.

Some retributivists apply a principle of equality in a less literal way to justify death as the punishment for murder.  The relevant equivalence is one of harms caused and suffered:  the murder victim suffers the harm of a life ended, and the only equivalent harm to be imposed as punishment, then, must be the death of the killer.  As a general way of linking kinds of misconduct and proper amounts, kinds, or forms of punishment, this rendition of lex talionis also faces challenges (Ten, 151-154).  Furthermore, it is also often noted that, even in the case of murder, there is no equivalence between the penal experience of capital offenders and their victims’ suffering in being murdered.  Albert Camus, in his “Reflections on the Guillotine,” makes the point in a rather dramatic way:

But what is capital punishment if not the most premeditated of murders, to which no criminal act, no matter how calculated, can be compared?  If there were to be a real equivalence, the death penalty would have to be pronounced upon a criminal who had forewarned his victim of the very moment he would put him to a horrible death, and who, from that time on, had kept him confined at his own discretion for a period of months.  It is not in private life that one meets such monsters.  (199)

This inequality of experience claim is even more to the point since even Kant maintains that “the death of the criminal must be kept entirely free of any maltreatment that would make an abomination of the humanity residing in the person suffering it” (102; Ak. 333).

b. Lex talionis as a Principle of Proportionality

Most contemporary retributivists interpret lex talionis not as expressing equality of crimes and punishments, but as expressing a principle of proportionality for establishing the merited penal response to a crime such as murder.  The idea is that the amount of punishment merited is to be proportional to the seriousness of the offense, more serious offenses being punished more severely than less serious crimes.  So, one constructs an ordinal ranking of crimes according to their seriousness and then constructs a corresponding ranking of punishments according to their severity.  The least serious crime is then properly punished by the least severe penalty, the second least serious crime by the second least severe punishment, and so on.  The gravest misconduct, then, is properly addressed by the most severe of punishments, death.

To carry out such a general project of constructing scales of crimes and matching punishments is a daunting challenge, as even many retributivists admit.  Aside from these concerns, as a defense of capital punishment this approach to lex talionis simply begs the question about the morality of the death penalty, even for the most serious of crimes.   There is no reason to think that current capital punishment practices are the most severe punishment.  Consider medieval practices of death with torture, or death “with extreme prejudice”; and are there not conditions of confinement that are possibly more severe than execution, such as years of brutal, solitary confinement or excessively hard labor?  Such punishments would not likely now be on a list of morally permissible penal responses to even the most serious crimes.  But then what is needed is some justification for setting an upper bound of morally permissible severity for punishments, “a theory of permissibility” (Finkelstein, “A Contractarian Approach…,” 212-213).  But whether today’s death penalty is morally permissible is precisely the question at issue.  The retributivist proportionality interpretation of lex talionis simply assumes capital punishment is morally permissible, rather than offering a defense of it.

One general concern about appeals to lex talionis, under any interpretation, is that relying on “the law of retaliation” can appear to make capital punishment tantamount to justified vengeance.  But Kant and other retributivist defenders of the death penalty rightly distinguish principled retribution from vengeance.   Vengeance arises out of someone’s hatred, anger, or desires typically aimed at another:  there is no internal limit to the severity of the response, except perhaps that which flows from the personal perspective of the avenger.  The avenger’s response may be markedly disproportionate to the offense committed, whereas retributivists insist that the severity of punishments must be matched to the misconduct’s gravity.  Vengeance is typically personal, a response to an injury done to someone about whom the avenger cares.  Retribution requires responses even to injuries of people no one cares about:  its impersonality makes harms to the friendless as weighty as harms to the popular and justifies punishment without regard to whether anyone desires the offender suffer.  The avenger typically takes pleasure in the suffering of the offender, whereas “we may all deeply regret having to carry out the punishment” (Pojman, 23) or only take “pleasure at justice being done” (Nozick, 367) as a retributivist moral principle requires.  Even if desires for vengeance are satisfied by executing murderers, for retributivists such effects are not at the heart of the defense of capital punishment.  And to the extent that such satisfactions are sufficient justification, the defense is no longer retributivist, but utilitarian or consequentialist (see sections 3 and 4).  For retributivists the morality of the death penalty for murder is a matter of general moral principle, not assuaging any desires for revenge or vengeance on the part of victims or others.

c. Retributivism and the Right to Life

Some forms of retributivism about capital punishment eschew reliance on lex talionis in favor of other kinds of moral principles, and they typically depart from Kant’s conclusion that murderers must be punished by death, regardless of any consequences.  One approach employs the idea of basic moral rights, such as the right to life, an expression of the value of life that seems to work against justifying capital punishment.   Yet John Locke, for example, in his Second Treatise on Government, both posits a natural right to life and defends the death penalty for murderers.  Echoing a line of reasoning exhibited in Thomas Aquinas’s defense of capital punishment (Summa Theologiae II-II, Q. 64, a.2), Locke claims that a murderer violates another’s right to life, and thereby “declares himself… to be a noxious creature… and therefore may be destroyed as a lion or a tiger, one of those wild savage beasts… both to deter others from doing the like injury… and also to secure men from the attempts of a criminal” (Treatise, sections 10-11).  For Locke, murderers have, by their voluntary wrongdoing, forfeited their own right to life and can therefore be treated as beings not possessing any right to life at all and as subject to execution to effect some good for society.

This retributivist position notably departs from Kant’s extreme view in concluding only that a murderer may be put to death, not must be, and by invoking utilitarian thinking as a secondary consideration in deciding whether capital punishment is morally justified for murderers who have forfeited their right to life.  This form of retributivism—rights forfeiture and considering consequences of the death penalty—is also explicitly expressed by W. D. Ross in his 1930 book, The Right and the Good:

But to hold that the state has no duty of retributive punishment is not necessarily to adopt a utilitarian view of punishment.… [T]he main element in any one’s right to life or property is extinguished by his failure to respect the corresponding right in others.… [T]he offender, by violating the life or liberty or property of another, has lost his own right to have his life, liberty, or property respected, so that the state has no prima facie duty to spare him as it has a prima facie duty to spare the innocent.  It is morally at liberty to injure him as he has injured others, or to inflict any lesser injury on him, or to spare him, exactly as consideration of both of the good of the community and of his own good requires. (60-61)

The retributivist argument, then, is that murderers forfeit their own right to life by virtue of voluntarily taking another’s life.  Since a right to life, like other rights, logically entails a correlative duty of others (see Consequentialism and Ethics, section 2b), by forfeiting their right to life murderers eliminate the state’s correlative duty not to kill them; the murderer’s forfeiture makes morally permissible the state’s putting them to death, at least as a means to some good.  Thus, capital punishment is not a violation of an offender’s right to life, as the offender has forfeited that right, and the death penalty is then justifiable as a morally permissible way to treat murderers in order to effect some good for society.

This kind of retributivist approach to capital punishment raises philosophic issues, aside from its reliance on empirical claims about the effects of the death penalty as a way to deter or incapacitate offenders (see section 3b). First, though the idea of forfeiting a right may be familiar, it leaves “troubling and unanswered questions: To whom is it forfeited? Can this right, once forfeited, ever be restored? If so, by whom, and under what conditions” (Bedau, “Capital Punishment,” 162-3)?  Second, given that the right to life is so fundamental to all rights and, as many maintain, held equally by each and all because they are humans, perhaps the right to life is exceptional or even unique in not being forfeitable at all: the right to life is actually a fundamental natural or human right.  One’s actions cannot and do not alter one’s status as a human being, Locke and Aquinas notwithstanding; thus, the right to life is inalienable and not forfeitable.  Even killers retain their right to life, the state remains bound by the correlative duty not to kill a murderer, and capital punishment, then, is a violation of the human right to life.

Developed in this way, as a matter of fundamental human rights, the merit of capital punishment becomes more about the moral standing of human beings and less about the logic and mobility of rights through forfeiture or alienation.  The point of a human right to life is that it “draws attention to the nature and value of persons, even those convicted of terrible crimes.… Whatever the criminal offense, the accused or convicted offender does not forfeit his rights and dignity as a person” (Bedau, “Reflections,” 152, 153).   This view reflects at least the spirit of the 1948 United Nations Universal Declaration of Human Rights: the right to life is universal, is rooted in each person’s dignity, and is unalienable (Preamble; Article 3).   But this view of offenders’ moral standing can be challenged if one considers the implication that, of equal standing with any of us, then, are masters of massacres or genocide (for example, Hitler, Stalin, Pol Pot), serial killers, terrorists, rampant rapists, and pedophiliac predators.  As one retributivist defender of capital punishment puts it, “though a popular dogma, the secular doctrine that all human beings have… worth is groundless.  The notion… [is] perhaps the most misused term in our moral vocabulary.… If humans do not possess some kind of intrinsic value… then why not rid ourselves of those who egregiously violate… our moral and legal codes” (Pojman, 35, 36).

d. Retributivism and Fairness

A recently revived retributivism about the death penalty builds not on individual rights, but on a notion of fairness in society.  Given a society with reasonably just rules of cooperation that bestow benefits and burdens on its members, misconduct takes unfair advantage of others, and punishment is thereby merited to address the advantage gained:

A person who violates the rules has something that others have—the benefits of the system—but by renouncing what others have assumed, the burdens of self-restraint, he has acquired an unfair advantage.  Matters are not even until this advantage is in some way erased…. [P]unishing such individuals restores the equilibrium of benefits and burdens. (Morris, 478)

The morally justified amount, kind, or form of punishment for a crime is then determined by an “unfair advantage principle”:

His crime consists only in the unfair advantage… [taken] by breaking the law in question. The greater the advantage, the greater the punishment should be.  The focus of the unfair advantage principle is on what the criminal gained.  (Davis, 241)

In justifying an amount of punishment, then, an unfairness principle focuses on the advantage gained, whereas the lex talionis principle attends to the harm done to another (Davis, 241).

The fairness approach to punishment reflects recent uses of “the principle of fairness” as a theory of political obligation:  those engaged in a mutually beneficial system of cooperation have a duty to obey the rules from which they benefit (Rawls, 108-114).  As applied to punishment, though, its roots run also to ancient, archaic notions of justice as re-establishing an equilibrium, to Aristotle’s Nicomachean Ethics treatment of justice as requiring state corrective action to rectify the imbalances created by criminal misconduct (Book V, Chapter 4), and to G.W.F. Hegel’s claim in The Philosophy of Right that to punish “is to annul the crime… and to restore the right” (69, 331n).   Today’s popular parlance that punishment is how offenders pay for their crimes can also be seen as their paying for unfair advantages gained.

As a general approach to justifying the amount of punishment merited for misconduct, the fairness approach initially appears to work best for petty theft or possibly “free-loading” in cooperative schemes, such as penalizing tax evasion.   In such cases one can perhaps see unfair advantage gained and see the amount of punishment as tied to what is unfairly gained.  But for violent crimes such as murder, the fairness approach seems less plausible.  How does lengthy incarceration or even execution erase the unfair advantage gained, annul the crime, or re-establish any prior balance between perpetrator and victim?  To the extent that punishment effects such things, it risks conflating retribution with restitution or restoration.  The unfair advantage principle also characterizes the wrong committed not in terms of its effects on a victim, but in terms of its effects on third parties, that is, society members who exercise self-restraint by obeying those norms the offender violates.  This oddly sidelines the victim of criminal misconduct, especially for violent crimes: the person assaulted or killed is not the focus in justifying the amount of punishment, but third parties’ burdens of self-restraint are.  Additionally, taken by itself, the unfair advantage approach to establishing the proper amount of punishment can also have some odd consequences, as Jeffrey Reiman rather colorfully suggests:

For example, it would seem that the value of the unfair advantage taken of law-obeyers by one who robs a great deal of money is greater than the value of the unfair advantage taken by a murderer, since the latter gets only the advantage of ridding his world of a nuisance while the former will be able to make a new life… and have money left over for other things.  This leads to the counterintuitive conclusion that such robbers should be punished more severely… than murderers.  (“Justice, Civilization,…,” note 10)

The death penalty for murder, then, would not obviously be morally justified if the general criterion for the amount of punishment is an unfair advantage principle.

A defense of the death penalty for murder has been proposed by employing another version of this general approach to punishment.  The key is seeing the kind of unfair advantage gained by a murderer.  As Reiman suggests in the spirit of Hegelian retributivism, the act of killing another disrupts “the relations appropriate to equally sovereign individuals;” it is “an assault on the sovereignty of an individual that temporarily places one person (the criminal) in a position of illegitimate sovereignty over another (the victim)”; then there is “the right to rectify this loss of standing relative to the criminal by meting out a punishment that reduces the criminals’ sovereignty to the degree to which she vaunted it above her victim’s” (“Why…,” 89-90).   So, if a murder is committed and a life taken, the idea is that the amount of permissible punishment is for the state, as the victim’s agent, to assert a supremacy over the criminal similar to that already asserted by the killer; and to do that it is permissible for the state to impose the death penalty for murder.  So, on this interpretation of the fairness principle, the death penalty for murder is morally justified, though, for other crimes, it may not be “easy or even always possible to figure out what penalties are equivalent to the harms imposed by offenders” (Reiman, “Why…,” 69-90, 93).  As with other forms of retributivism, the fairness approach, on either interpretation, is challenged by the plausibility of using a principle that adequately addresses both the merits of capital punishment for murder and also generates a system of penalties that “fit” or are equivalent to various crimes.

e. Challenges to Retributivism

Retributivist approaches to capital punishment are many and varied.  But from even the small sample above, notable similarities are often cited as challenges for this way of thinking about the moral justification of punishment by death.   First, retributivism with respect to capital punishment either invokes principles that are plausible, if at all, only for death as penalty for murder; or it relies on principles met only with reasoned skepticism about their general adequacy for constructing a plausible scale matching various crimes with proper penal responses.

Second, retributivists presuppose that persons are responsible for any criminal misconduct for which they are to be punished, but actually instituting capital punishment confronts the reality of some social conditions, for example, that challenge the presupposition of voluntariness and, in the case of the fairness approach, that challenge the presupposition of a reasonably just system of social cooperation (see section 5b).  Third, it is often argued that, in addressing the moral merits of capital punishment, retributivists ignore or make markedly secondary the causal consequences of the practice.  What if no benefits accrue to anyone from the practice of capital punishment?  What if capital punishment significantly increases the rate of murders or violent crimes?  What if the institution of capital punishment sometimes, often, or inevitably is arbitrary, capricious, discriminatory, or even mistaken in its selecting those to be punished by death (see section 5)?  These and other possible consequences of capital punishment seem relevant, even probative.  The challenge is that retributivists ignore or diminish their importance, perhaps defending or opposing the death penalty despite such effects and not because of them.

3. Utilitarian Approaches

A utilitarian approach to justifying capital punishment appeals only to the consequences or effects of death being the penalty for serious crimes, such as murder.  A utilitarian approach, then, is a kind of consequentialism and is often said to be “forward looking,” in contrast to retributivists’ “backward looking” approach.   More specifically, a utilitarian approach sees punishment by death as justified only if that amount of punishment for murder best promotes the total happiness, pleasure, or well-being of the society.  The idea is that the inherent pain and any negative effects of capital punishment must be exceeded by its beneficial effects, such as crime prevention through incapacitation and deterrence; and furthermore, the total effects of the death penalty—good and bad, for offender and everyone else—must be greater than the total effects of alternative penal responses to serious misconduct, such as long-term incarceration.   A utilitarian approach to capital punishment is inherently comparative in this way: it is essentially tied to the consequences of the practice being best for the total happiness of the society.  It follows, then, that a utilitarian approach relies on what are, in principle, empirical, causal claims about the total marginal effects of capital punishment on offenders and others.

a. Classic Utilitarian Approaches: Bentham, Beccaria, Mill

A classic utilitarian approach to punishment is that of Jeremy Bentham.  In chapters XIII and XIV of his lengthy work, An Introduction to the Principles of Morals and Legislation, first published in 1789, Bentham addresses the appropriate amount of punishment for offenses, or, as he puts it, “the proportion between punishments and offences.”  He begins with some fundamental features of a utilitarian approach to such issues:

The general object which all laws have, or ought to have in common, is to augment the total happiness of the community.… But all punishment is mischief: all punishment in itself is evil.  Upon the principle of utility, if it ought at all to be admitted, it ought only to be admitted in as far as it promises to exclude some greater evil.  (XIII. I, ii.)

Bentham continues by noting the importance of attending to “the ends of punishment”:

The immediate principal end of punishment is to control action.… [T]hat of the offender it controls by its influence… on his will, in which case it is said to operate in the way of reformation;  or on his physical power, in which case it is said to operate by disablement: that of others it can influence no otherwise than by its influence over their wills; in which case it is said to operate in the way of example. (XIII. ii. fn. 1)

So, there are three major ends of punishment related to controlling people’s action in ways promoting the total happiness of the community through crime reduction or prevention: reformation of the offender, disablement (that is, incapacitation) of the offender, and deterrence (that is, setting an example for others).   Of these three ends of punishment, Bentham says “example” – or deterrence – “is the most important end of all” (XIII. ii. fn. 1).  Since “all punishment is mischief [and] an evil,” any amount of punishment, then, is justified only if that mischief is exceeded by the penalty’s good effects, and, most importantly for Bentham, only if the punishment reduces crime by deterring others from misconduct and does so better than less painful punishments.  In other writings, Bentham explicitly applies his utilitarian approach to capital punishment, first allowing its possible justification for aggravated murder, particularly when the “effect may be the destruction of numbers” of people, and then, years later and late in life, calling for its complete abolition (Bedau, “Bentham’s Utilitarian Critique…”).

In his own writing about law, Bentham notably praises and acknowledges Cesare Beccaria’s On Crimes and Punishments, its utilitarian approach to penal reform, and its call for abolishing capital punishment. Beccaria called for abolition of the death penalty largely by appealing to its comparative inefficacy in reducing the crime rate.  In Chapter XII of his essay, Beccaria says the general aim of punishment is deterrence and that should govern the amount of punishment to be assigned crimes:

The purpose of punishment… is nothing other than to dissuade the criminal from doing fresh harm to his compatriots and to keep other people from doing the same.  Therefore, punishments and the method of inflicting them should be chosen that… will make the most effective and lasting impression on men’s minds and inflict the least torment on the body of the criminal. (23; Ch. XII)

He then argues that “capital punishment is neither useful nor necessary” in comparison to the general deterrent effects of lengthy prison sentences:

[T]here is no one who, on reflection, would choose the total and permanent loss of his own liberty, no matter how advantageous a crime might be.  Therefore, the intensity of a sentence of servitude for life, substituted for the death penalty, has everything needed to deter the most determined spirit.… With capital punishment, one crime is required for each example offered to the nation; with the penalty of a lifetime at hard labor, a single crime affords a host of lasting examples. (49-50, 51; Ch. XXVIII)

The idea here is that an execution is a single, severe event, perhaps not long remembered by others, whereas life imprisonment provides a continuing reminder of the punishment for misconduct.  In general, Beccaria says, “[i]t is not the severity of punishment that has the greatest impact on the human mind, but rather its duration, for our sensibility is more easily [and] surely stimulated by tiny repeated impressions than by a strong but temporary movement” (49; Ch. XXVIII).

Beccaria adds to this thinking at least two claims about some bad social effects of capital punishment: first, for many the death penalty becomes a spectacle, and for some it evokes pity for the offender rather than the fear of execution needed for effective deterrence of criminal misconduct (49; Ch. XXVIII).  Second, “capital punishment is not useful because of the example of cruelty which it gives to men.… [T]he laws that moderate men’s conduct ought not to augment the cruel example, which is all the more pernicious because judicial execution is carried out methodically and formally” (51; Ch. XXVIII).  Thus, Beccaria opposes capital punishment by employing utilitarian thinking: the primary benefit of deterrence is better achieved through an alternative penal response of “a lifetime at hard labor,” and, furthermore, the cruelty of the death penalty affects society in ways much later called “the brutalization effect.”

Another major utilitarian, John Stuart Mill, also exemplifies distinctive facets of a utilitarian approach, but in defense of capital punishment.  In an 1868 speech as a Member of Parliament, Mill argues that capital punishment is justified as penalty for “atrocious cases” of aggravated murder (“Speech…,” 268).  Mill maintains that the “short pang of a rapid death” is, in actuality, far less cruel than “a long life in the hardest and most monotonous toil… debarred from all pleasant sights and sounds, and cut off from all earthly hope” (“Speech…,” 268).  As Sorell succinctly summarizes Mill’s position, “hard labor for life is really a more severe punishment than it seems, while the death penalty seems more severe than it is” (“Aggravated Murder…,” 204).  Since the deterrent effect of a punishment depends far more on what it seems than what it is, capital punishment is the better deterrent of others while also involving less pain and suffering for the offender.  Such a combination “is among the strongest recommendations a punishment can have” (Mill, “Speech…,” 269). And so, Mill says, “I defend [the death penalty] when confined to atrocious cases… as beyond comparison the least cruel mode in which it is possible adequately to deter from the crime” (“Speech…,” 268).

b. Empirical Considerations: Incapacitation, Deterrence

A utilitarian approach to capital punishment depends essentially on the actual causal effects of the practice, that is, on whether the death penalty is in fact effective in incapacitating or deterring potential offenders.  If it does not effect these ends better than penal alternatives such as lengthy incarceration, then capital punishment is not justified on utilitarian grounds.   In principle, at least, the comparative efficacy of capital punishment is therefore an empirical issue.

A number of social scientific studies have been conducted in search of conclusions about the effects of capital punishment, at least in America.  With respect to the end of incapacitation, any crime prevention benefit of executing murderers depends on recidivism rates, that is, the likelihood that murderers again kill.  Recent studies of convicted murderers—death row inmates not executed, prison homicides, parolees, and released murderers—indicate that the recidivism rate is quite low, but not zero: a small percentage of murderers kill again, either in prison or upon release (Bedau, The Death Penalty, 162-182).  These crimes, of course, would not have occurred were capital punishment imposed, and, so, the death penalty does prevent commission of some serious crimes.  On the other hand, for a utilitarian, these benefits of incapacitation through execution must exceed those for possible punitive alternatives.  The data reflects recidivism rates under current practices, not other possible alternatives.  If, for example, pardons and commutations were eliminated for capital crimes, if atrocious crimes were punished by a life sentence without any possibility of parole, or if conditions of confinement were such that prison murders were not possible (for example, shackled, solitary confinement for life), then the recidivism rate might approach or be zero.  One issue, then, is how high or low a recidivism rate decides the justificatory issue for capital punishment.  Another issue is the moral permissibility of establishing conditions of confinement so restrictive that even murders in prison are reduced to nearly zero.

Since the mid-twentieth century, in America a number of empirical studies have been conducted in order to assess the deterrent effects of capital punishment in comparison to those of life imprisonment.  Scholars analyzed decades of data to compare jurisdictions with and without the death penalty, as well as the effects before and after a jurisdiction abolished or instituted capital punishment.   Such analyses “do not support the deterrence argument regarding capital punishment and homicide” (Bailey, 140).  A sophisticated statistical study published in the mid-1970s claimed to show that each execution deterred seven to eight murders.  This exceptional study and its methodology have been much criticized (Bailey, 141-143).  Additional, more recent studies and analyses have “failed to produce evidence of a marginal deterrent effect for capital punishment” (Bailey, 155).  As indicated by Jeffrey Reiman’s succinct summary and numerous, cited literature surveys (“Why…” 100-102), nearly all relevant experts claim there is no conclusive evidence that capital punishment deters murder better than substantial prison sentences.

Determining the deterrent effects of capital punishment does present significant epistemic challenges.  First, in comparative studies of jurisdictions with and without the death penalty, “there simply are too many variables to be controlled for, including socio-economic conditions, genetic make-up,” demographic factors (for example, age, population densities), varying facets of law enforcement, etc.  (Pojman, 139). Numerous variables may or may not explain the data used to link crime rates and the death penalty in different places or times (Pojman, 139). Second, as Beccaria notes, for example, deterrent effects plausibly depend importantly on the certainty, speed, and public nature of penal responses to criminal conduct.  These factors have not been much evident in recent capital punishment practices in America, which may explain the lack of evidence revealed by recent statistical studies.  Third, deterrence is a causal concept:  the idea is that potential murderers do not kill because of the death penalty.  So, the challenges are to measure what does not occur (murders) and to establish what causes the omission (the death penalty).  The latter element is even more challenging to measure because most who do not murder refrain out of habit, character, religious beliefs, lack of opportunity, etc., that is, for reasons other than any perceived threat or fear of execution by the state.  Deterrence studies, then, attempt to establish empirically a causal relationship for a small minority of people and omitted homicides within a death penalty jurisdiction.  Finally, there are disagreements about the importance of the studies’ conclusions.  For example, abolitionists typically hold that the failure, despite numerous attempts, to provide conclusive evidence strongly suggests there is no such effect: the death penalty, in fact, does not deter.  Defenders of capital punishment are inclined to interpret the empirical studies as being inconclusive: it remains an open question whether the death penalty deters sufficiently to justify it.  And all this is further complicated by the fact that some studies focus on the effects of capital statutes and others look for links between actual executions and crime rates.

c. Utilitarian Defenses: “Common Sense” and “Best Bet”

Regardless of the outcomes or probative value of statistical studies, justifying capital punishment on grounds of deterrence may still have merit.  It would seem, some maintain, that “common sense” supports the notion that the death penalty deters.  The deterrence justification of capital punishment presupposes a model of calculating, deliberative rationality for potential murderers.  What people cherish most is life; what they most fear is being killed.  So, given a choice between life in prison and execution by the state, most people much prefer life and therefore will refrain from misconduct for which death is the punishment.  In short, “common sense” suggests that capital punishment does deter.  But this kind of appeal to “common sense” ignores the essentially comparative aspect of appeals to deterrence as justification: though capital punishment may deter, it may not deter any more (or significantly more) than a long life in prison. We cannot equate “what is most feared” with “what most effectively deters” (Conway, 435-436; Reiman, “Why…,” 102-106).

Another way of looking at capital punishment in terms of deterrence relies on making the best decision under conditions of uncertainty.  Given that the empirical evidence does not definitively preclude that capital punishment is a superior deterrent, “the best bet” is to employ the death penalty for serious crimes such as murder.  If capital punishment is not, in fact, a superior deterrent, then some murderers have been unnecessarily executed by the state; if, on the other hand, death is not a possible punishment for murder and capital punishment is, in fact, a superior deterrent, then some preventable killings of innocent persons would occur.  Given the greater value of innocent lives, the less risky, better option justifies capital punishment on grounds of deterrence. But the argument crucially depends on comparative risk assessments: if there is capital punishment, then certainly some murderers will be killed, whereas without the death penalty there is only a remote chance that more innocent lives would be victims of murder (Conway, 436-443).  Furthermore, the argument openly assumes that not all lives are equal—those of the innocent are not to be risked as much as those who have murdered—and that, for some, is a fundamental moral issue at stake in justifying capital punishment (see section 2c; Pojman, 35-36).

d. Challenges to Utilitarianism

Utilitarian approaches to justifying punishment are controversial and problematic, perhaps most often with respect to possibly justifying punishment of the innocent as a means to preventing crime and promoting total happiness of a society.  Even ignoring this issue and focusing only on justifying the proper amount of punishment for the guilty and the death penalty, in particular, there are concerns to be considered about a utilitarian approach.  The objection is that a utilitarian approach to the death penalty relies on a suspect general criterion, deterrence, for establishing the proper amount of punishment for crimes.  It is often argued that, for purposes of crime prevention through deterrence, a utilitarian is committed, at least in principle, to excessively severe punishments, such as torturous and gruesome executions in public even for crimes much less serious than murder (for example, Ten, 34-35, 143-145).  The idea is that the pain of excessively severe and public punishments for minor crimes is more than counterbalanced by a significant reduction in a crime rate.  It is also argued that significant crime rate reductions could perhaps be achieved, in some circumstances, by disproportionately minor punishments:  if fines, light prison sentences, or even fake executions could deter as well as actual ones, then a utilitarian is committed to disproportionately mild penalties for grave crimes.  Utilitarians respond to such possibilities by indicating additional considerations relevant to calculating the total costs of such disproportionate punishments, while critics continue creating even more elaborate, fantastic counterexamples designed to show the utilitarian approach cannot always avoid questions about the upper or lower limits of morally permissible penal responses to misconduct.  As C. L. Ten summarizes succinctly, a utilitarian approach establishing a proper amount of punishment is “inadequate to account for both the strength of the commitment to the maintenance of a proportion between crime and punishment, and [to] the great reluctance to depart… from that proportion when required to so do by purely aggregative consequential considerations” (146).

Another common criticism of the utilitarian approach points to the very structure of justifications rooted in deterrence.  As evident in Bentham’s classic statements, for example, the purpose of punishment “is to control action,” primarily through deterrence (see section 3a).  Punishments deter and “control action” by example, by the demonstration to others that they, too, will suffer similarly should they similarly misbehave. Capital punishment, then, aims to deter actions of potential killers by inflicting death on actual ones: the technique works by threat, by instilling fear in others.  A fundamental objection to this way of thinking is to see that, in effect, persons are being used as a means to controlling others’ actions; capital offenders are being used simply as a means to deter others and reduce the crime rate.  Such a use of persons is morally impermissible, it is argued, echoing Immanuel Kant’s famous categorical imperative against treating any person merely as means to an end.  No gain in deterrence, incapacitation, or other beneficial effects can justify deliberately killing a captive human being as a means to even such desirable ends as deterring others from committing grave crime.  The argument, then, is that justifying capital punishment on grounds of deterrence is a morally impermissible way to treat persons, even those found to have committed atrocious crimes.

e. Other Consequential Considerations

In discussions of capital punishment, it is deterrence that receives much of the attention for those exploring a utilitarian approach to the moral justification of the practice.  There are, however, other significant consequences of the death penalty that are relevant, as noted even by classic utilitarians.  Beccaria, for example, asserts a brutalization effect on society: executions are cruel and are examples to others of the state’s cruelty.  The suggestion seems to be that capital punishment increases people’s tolerance for another’s suffering, their callousness about human suffering, a willingness to impose suffering on another, even the rate of violent crimes (for example, assaults or homicides).  In a similar vein, Jeffrey Reiman, who ultimately favors abolition, argues that, for some developed societies, abolition of capital punishment for serious crimes shows restraint and thereby actually advances civilization by reducing our tolerance for others’ suffering.  Such claims are, in principle, empirical ones about the causal effects of the practice of capital punishment.  As with recent deterrence studies, there is no clear empirical evidence of any brutalizing or civilizing effects of capital punishment.

For classic utilitarian thinking, another important consequence of punishment is its effect on the offender.   According to Jeremy Bentham, one of the three ends of punishment is reform of the offender through “its influence on his will” (XIII.ii. fn. 1).  This penal aim of reform (or rehabilitation) may suggest capital punishment is not justifiable for any crime.  But that need not be the case.  The ancient Roman Stoic Seneca, for example, argues that proper punishment for criminal misconduct depends on its “power to improve the life of the defendant” (Nussbaum, 103).   But he also defends capital punishment as a kind of merciful euthanasia: execution is “in the interest of the punished, given that a shorter bad life is better than a longer one” (Nussbaum, 103, note 43).  Plato also defends capital punishment by looking to its impact on the offender.  In his later works and as part of a general theory of penology, Plato maintains that the primary penal purpose is reform—to “cure” offenders, as he says.  For crimes that show offenders are “incurable,” Plato argues execution is justifiable.  In his late work, The Laws, Plato explicitly prescribes capital punishment for a wide range of offenses, such as deliberate murder, wounding a family member with the intent to kill, theft from temples or public property, taking bribes, and waging private war, among others (MacKenzie; Stalley).  In a utilitarian approach to capital punishment, then, attending to the end of reforming offenders need not be irrelevant to possible moral justifications of the death penalty.

4. Capital Punishment as Communication

A cluster of distinctive approaches to issues of justifying punishment and, at least by implication, the death penalty, are united by taking seriously the idea of punishment as expression or communication.  Often called “the expressive theory of punishment,” such approaches to punishment are sometimes classified as utilitarian or consequentialist, sometimes as retributivist, and sometimes as neither.  The root idea is that punishment is more than “the infliction of hard treatment” by an authority for prior misconduct; it is also “a conventional device for the expression of attitudes of resentment and indignation, and of judgments of disapproval and reprobation….  Punishment, in short, has a symbolic significance” (Feinberg, “The Expressive Function…,” 98).  Hard treatment, deprivations, incarceration, or even death can be, and perhaps are, vehicles by which messages are communicated by the community.  To see capital punishment as a deterrent is to see it as communicative:  the death penalty communicates to the community—at least potential killers—that murder is a serious wrong and that execution awaits those who kill others.  Various developments of punishment as communication, though, attend to other messages expressed, some emphasizing the sender and others the recipient of the message.

One version of this kind of approach emphasizes that, with capital punishment, a community is expressing strong disapproval or condemnation of the misconduct.  The basic contention of this approach, sometimes called “the denunciation theory,” is evident in James Fitzjames Stephen’s late 19th-century work, Liberty, Equality, Fraternity (a reply to J.S. Mill’s On Liberty), as well as in the oft-quoted remarks of Lord Denning recorded in the 1953 Report of the Royal Commission on Capital Punishment:

The punishment for grave crimes should adequately reflect the revulsion felt by the great majority of citizens for them. It is a mistake to consider the object of punishment as being deterrent or reformative or preventive and nothing else.… The ultimate justification of any punishment is not that it is a deterrent but that it is the emphatic denunciation by the community of a crime; and from this point of view, there are some murders which, in the… public opinion, demand the most emphatic denunciation of all, namely the death penalty. (As quoted in Hart, “Punishment…,” 170)

In the United States, Supreme Court decisions in death penalty cases have more than once employed such reasoning:  a stable, ordered society is better promoted by capital punishment practices than risking “the anarchy of self-help, vigilante justice, and lynch law” as ways of expressing communal outrage (Justice Stewart, in Furman v. Georgia (1972), as quoted in Gregg v. Georgia (1976)).

As a defense of capital punishment, at least, this “denunciation theory” leaves multiple questions not adequately addressed.  First, the approach presupposes some moral merit to popular sentiments of indignation, outrage, anger, condemnation, even vengeance or vindictiveness in response to serious misconduct.  There are significant differences between expressing such emotions and punishing justly or morally (see section 2b).  Second, the structure of the thinking seems entirely consequentialist or utilitarian: capital punishment is justified as effective means to communicate condemnation, or to satisfy others’ desires to see someone suffer for the crime, or as an outlet for strong, aggressive feelings that otherwise are expressed in socially disruptive ways.  Such utilitarian reasoning would seem to justify executing pedophiles or even innocent persons in order to communicate condemnation or avoid an “anarchy of self-help, vigilante justice, and lynch law.” Yet even Jeremy Bentham argues that “no punishment ought to be allotted merely to this purpose” because such widespread satisfactions or pleasures cannot ever “be equivalent to the pain… produced by punishment” (Bentham XIII. ii. fn. 1).  Third, it leaves unanswered why the expression of communal outrage, even if morally warranted, is best or only accomplished through capital punishment.  Why would not harsh confinement for life serve as well any desirable expressive, cathartic function?  Or on what grounds are executions not to be conducted in ways torturous and prolonged, even publicly, as means of better communicating denunciation and expressing society’s outrage about the offenders’ misconduct?  And does not the death penalty also express or communicate other, conflicting messages about, for example, the value of life?  As a justification of capital punishment, even for the most heinous of crimes, a “denunciation theory” faces significant challenges.

Other uses of the idea of punishment as communication focus not on the sender of the message, but on the good of the intended recipient, the offender.  Punishment is paternalistic in purpose: it aims to effect some beneficial change in the offender through effective communication.  In Philosophical Explanations Robert Nozick, for example, holds that punishment is essentially “an act of communicative behavior” and the “message is: this is how wrong what you did was” (370).  Wrongdoers have “become disconnected from correct values, and the purpose of punishment is to (re)connect him” (374).  The justified amount of punishment, then, is tied to the magnitude of the wrong committed (363): “for the most serious flouting of the most important values… capital punishment is a response of equal magnitude” (377).  But, Nozick maintains, the aim of punishment is not to have an effect on the offender, but “for an effect in the wrongdoer: recognition of the correct value, internalizing it for future action—a transformation in him” (374-5).  This paternalistic end seems to preclude the death penalty being imposed for any kind of wrongdoing; however, in “truly monstrous cases” (for example, Adolf Hitler, genocides) there seems to be perhaps the highest magnitude of wrong, a disconnection from the most basic values, and acts worthy of the most emphatic penal expression possible.  As Nozick himself admits and others have noted, this approach to punishment as communication provides “no clear stable conclusion… on the issue of an institution of capital punishment” (378).

Some employing a similar reliance on punishment as communication are less ambivalent about its implications for the death penalty.   The “moral education theory of punishment,” its proponent maintains, precludes “cruel and disfiguring punishments such as torture or maiming,” as well as “rules out execution as punishment” (Hampton, 223).  This argument for death penalty abolition takes seriously the expressive, communicative function of punishments: as aiming to effect significant benefits in and for the offender and, through general deterrence and in other ways, as “teaching the public at large the moral reasons for choosing not to perform an offense” (Hampton, 213).  Punishment as education is not a conditioning program; it addresses autonomous beings, and the moral good aimed at is persons freely choosing attachment to that which is good.  Executing criminals, then, seems to require judging them as having “lost all their essential humanity, making them wild beasts or prey on a community that must, to survive, destroy them” (Hampton, 223).  Furthermore, it is argued, capital punishment conveys multiple messages, for example, about the value of a human life; and, it is argued, since one can never be certain in identifying the truly incorrigible, the death penalty is morally unjustified in all cases.   As R.A. Duff puts the abolitionist point in Punishment, Communication, and Community (2001), “punishment should be understood as a species of secular penance that aims not just to communicate censure but thereby to persuade offenders to repentance, self-reform, and reconciliation” (xvii-xix).

Approaches to capital punishment as paternalistic communication are challenged on several grounds.  First, as a general theory of punishment, such expressive theories posit an extraordinarily optimistic view of offenders as open to the message that penal experiences aim to convey.  Are there not some offenders who will not be open to moral education, to hearing the message expressed through their penal experiences?  Are there not some offenders who are incorrigible?  On these approaches to capital punishment, the reasons against executing serious offenders are essentially empirical ones about the communicative effects on the public of executions or the limits of diagnostic capabilities in identifying the truly incorrigible.  Second, with respect to capital punishment, perhaps for some offenders, the experience of trial, sentencing, and awaiting execution does successfully communicate and effect reform in the offender, with the death penalty then imposed to affirm that which effected the beneficial reform in the offender.  Third, as with other approaches to punishment, the moral education theory renders it extremely difficult, if not impossible, to “fashion a tidy punishment table” pairing kinds of misconduct and merited penalties (Hampton, 228).  Focusing on reforming or educating a recipient of a message suggests very individualistic and situational sentencing guidelines.  Not only may this not be practical, but such discretion in sentencing risks caprice or arbitrariness in punishing offenders by death or in other ways (see section 5); and it challenges the fundamental, formal principle of justice, that is, that like cases be treated alike.  Finally, the implications of these approaches to punishment are quite at odds with the system of incarceration employed so universally for so many offenders.  The implications of punishment as communication aimed at the offender would require radical revisions of current penal practices, as some proponents readily admit.

5. The Institution of Capital Punishment

Much philosophic focus on punishment and the death penalty has been rooted in theoretical questions and principles.  A result is that philosophers have mostly ignored more practical matters and moral facets of the institution of capital punishment.  That historical tendency began to change in the mid-twentieth century with a decidedly American concern: whether the practice of capital punishment is legally permissible, given the United States Constitution’s Eighth Amendment prohibition of “cruel and unusual punishments.”  Scholars and lawyers investigated the history and continuing death penalty practices in America, producing evidence of racial discrimination in the institution of capital punishment, especially in southern states.  By the late 1970s, a series of United States Supreme Court decisions established especially elaborate criminal procedures to be followed in capital cases: bifurcated trials (one for conviction and one for establishing the sentence), a finding of at least one aggravator for a murder to be a capital crime, automatic appellate review of all sentences to death, guidelines for jury selections, etc. The aim of such “super due process” is to improve criminal procedures employed in capital cases so as to avoid arbitrariness in administering the death penalty in America (Radin).

After implementation of these Court-mandated procedures for death penalty cases, a number of empirical studies indicated continuing concerns and problems with the practice of capital punishment in America.  For example, studies of capital cases conducted in some southern states showed that disproportionately large numbers of convicted murderers received death sentences if they were black, a disproportion even greater when the convicted murderer was black and the victim was white (Bedau, The Death Penalty, 268-274).   Also, especially with the advent of new, scientific sources of evidence (for example, DNA matching), studies suggest that numbers of persons innocent of any crime have been wrongly convicted, sentenced, and even executed for committing a capital crime (Bedau, The Death Penalty, 344-360).   Morally justifying punishment in theory is distinguishable from whether it is justified in practice, given extant conditions.  For some, even though questions of theory and practice are distinguishable, they may not be unrelated. As Stephen Nathanson asks, “does it matter if the death penalty is arbitrarily administered?”

a. Procedural Issues: Imperfect Justice

Moral arguments about the death penalty based on procedural issues attend to the outcomes and steps of a long and involved process “as a person goes the road from freedom to electric chair” (Black, 22).  Such a process involves an “entire series of decisions made by the legal system”:  whether to arrest; what criminal charges to file; decisions about plea bargaining offers, if any;  a criminal trial, with jury selection, countless tactical decisions, possible employment of a defense like insanity; sentencing that requires juries find and weigh statutory factors of aggravation and mitigation; post-conviction appeals and possible remedies decided; clemency decisions, to commute a sentence or even pardon the convicted (Black, 22-26).  It is apparent, then, “that the choice of death as the penalty is the result of not just one choice… but of a number of choices, starting with the prosecutor’s choice of a charge, and ending with the choice of the authority… charged with the administration of clemency” (Black, 27).  At each one of these points of decisions, it is argued, there is room for arbitrariness, mistakes, even discrimination.  Furthermore, it is both impossible and undesirable to remove all latitude, all discretion, since discretion is what allows each of these decisions to be properly made in light of the particularities of the case, person, and situation.  And so, the institution of capital punishment, even as practiced in America, brings along with it “the inevitability of caprice and mistake” (Black).

A criminal trial and, more broadly, criminal procedures in toto are exemplars of what John Rawls, in A Theory of Justice, characterizes as imperfect procedural justice.   There is an independently defined standard external to the procedure by which we judge outcomes of the process; and there is no procedure “that is sure to give the desired outcome” (Rawls 74-75).  For criminal procedures, the aim is “to impose deprivations on all and only guilty convicted offenders because of their wrongdoing”; and for capital punishment, the aim is to impose the death penalty on all and only those guilty of committing crimes for which the merited amount of punishment is execution (Bedau, Reflections 173).  In capital procedures, too, it is “impossible to design the legal rules so that they always lead to the correct result” (Rawls, 75).  Whether due to inherent vagaries of legal language, the necessity of discretion to judge properly complex, particular cases, the fallibility of human beings, or political pressures and other factors affecting decisions made within the system, such as clemency, the risk of error is not eliminable for the institution of capital punishment.  Given unavoidably imperfect criminal justice procedures, at issue, then, is the moral import of any arbitrariness, caprice, mistake, or discrimination in the institution of capital punishment.

The appeal to procedural imperfections is often employed by those who oppose capital punishment and seek its complete abolition on the grounds that its institution is intolerably arbitrary, capricious, or discriminatory in selecting who lives and who dies. This abolitionist reasoning is challenged in various ways.  First, it is argued that, given the fact that there are imperfections in the system or practice of capital punishment, what follows is not abolition of the death penalty, but justification only for procedural improvements in order to reduce problematic outcomes.  A second issue, aside from disputes about the actual frequency of problematic outcomes, is a question of thresholds: how many imperfect outcomes are tolerable in the institution of capital punishment?  Abolitionists tend to have near-zero tolerance, whereas some defenders of capital punishment argue that some arbitrariness is acceptable.  For a utilitarian approach to capital punishment, assessing the total consequences (benefits and “costs”) of the death penalty must include the inevitable arbitrariness of its institution.  And inasmuch as any deterrent effects are linked to certainty of punishment, any degree of arbitrariness in administering capital punishment does affect a central utilitarian consideration in determining whether the institution is morally justified.  For retributivist approaches, the question is whether some arbitrariness in the institution violates requisite pre-conditions for morally justifying the institution of capital punishment (see section 2c).  Jeffrey Reiman, for example, argues, on retributivist grounds, that capital punishment is justified in principle; however, “the death penalty in… America is unjust in practice,” and he therefore favors abolition (see 5b).

A third issue for appeals to procedural imperfections involves limiting the scope of the argument for abolition.   Since all criminal cases are administered through unavoidably imperfect procedures, if arbitrariness justifies abolishing the death penalty for murder, then it would seem also to justify abolishing lesser punishments for less serious criminal misconduct.  In short, the imperfect administration of capital punishment matters morally only if the death penalty is distinctive among punishments.  Punishment by death is often said to be distinctive because, unlike incarceration, death is irrevocable.  But years spent imprisoned, for example, can also not be revoked, once they have been endured.  The idea must be that incarceration, if found to be mistaken, can be ceased: by executive or judicial action the imprisoned can be released and receive remedies, even if only gestures.   On the other hand, a death sentence, once executed, has none of those qualities: death is permanent; punishment by death has finality.  “Because of the finality and the extreme severity of the death penalty, we need to be more scrupulous in applying it as punishment than is necessary with any other punishment” (Nathanson, Eye, 67).

Another major issue involves distinguishing the kinds of imperfect outcomes resulting from the criminal procedures employed in capital cases.  For example, the arbitrariness evident in the procedures may be one of selectivity: among all the convicted killers who merit a death sentence, some of those are actually sentenced or executed and others are not.  As Ernest van den Haag argues, that some who merit the death penalty escape that punishment does not make morally unjustified selectively executing some who do merit that punishment (Nathanson, 49).  Analogies with selective ticketing for excessive speed support this kind of reasoning: justice is a matter of each individual being treated as they merit, without regard to how other, similar cases are treated.  But this argument makes what is just or justified entirely non-comparative, when substantive comparative considerations often are also necessary when arbitrariness or discrimination is at issue (Feinberg, “Noncomparative Justice,” 265-269).  Justice requires treating similar cases in similar ways, and this kind of arbitrary imposition of the death penalty violates that requirement.  Furthermore, it may matter morally what are the grounds of selecting only some convicted killers to receive death sentences or to be executed.  If the selectivity is based on race, for example, then the moral import of the arbitrariness might be far greater, whether for traffic tickets or the death penalty for murder.  Aside from the moral import of arbitrariness as selectivity, there is also an arbitrariness that issues in mistakes, where persons who did not commit a capital crime (or perhaps did not commit any crime at all) are wrongly convicted, sentenced and executed.  This sort of imperfect outcome would seem far more problematic morally than the selective execution of only some of those who merit the death penalty.  As Stephen Nathanson states it with respect to executing the innocent, “this is the moral force of the argument from arbitrary judgment” (Eye, 53).

b. Discrimination: Race, Class

Criminal justice systems that administer the death penalty operate in the context of a society that may or may not itself be entirely just.  The procedures employed in capital cases, then, can be imperfect due to external social factors affecting their outcomes, and not only due to features internal to the structure of a legal system itself.  Various sources of data suggest to many that American criminal justice procedures produce disproportionately large numbers of capital convictions and death sentences for the poor and for African-Americans.  In short, it is claimed, the institution of capital punishment is imperfect, capricious, or arbitrary in a particular way: it discriminates on the basis of economic class and race.   Poverty and race, it is argued, have “warping effects” on the long, involved process whereby “a person goes the road from freedom to electric chair” (Black, 22).   At numerous decision points, a lack of funds affects how the process proceeds for a poor person charged with a capital crime: the quality of legal counsel for plea bargaining, investigation, and conduct of a trial; financial resources needed to build a strong evidentiary case through crime scene investigation, forensic testing, and expert testimony at trial;  money for background investigations, professional examinations, and expert testimony in the crucial sentencing phase of a capital trial; securing attorneys for legally required and elective appeals; accessing those political offices and officers with the legally unlimited authority to commute a sentence or even pardon a convicted offender.   Given the high correlation in America between poverty and race, any disproportionate outcomes with respect to economic class parallel those with respect to race.  Also, as described above, the “entire series of decisions made by the legal system” in capital cases provides numerous opportunities for unconscious racial bias or blatant discrimination in the exercise of discretion by those administering the process.  Opponents of the death penalty, then, see factors of race and poverty as increasing the likelihood of error in capital cases, and see such discriminatory outcomes as especially problematic from a moral point of view.

This line of reasoning invokes the specter of discrimination in the institution of capital punishment.  The basic empirical claim is that, by race and economic class, America’s imperfect procedures produce disproportionate outcomes.  The issue is not necessarily one of intentional racial discrimination, though that may occur, as well.  Considerations of perhaps unintended discriminatory outcomes, however, need not support abolition of the death penalty.  Aside from disputes about the data supporting the basic empirical claim of disproportionate outcomes, responses parallel those reviewed above with respect to the internal structures of criminal justice procedures in capital cases (see section 5a).  In particular, it is argued that disproportionate outcomes support reforms to mitigate such discrimination, such as quality legal representation being provided for the poor, increased budgetary allocations for defense of the indigent in capital cases, etc. And given that what explains the disproportionate outcomes are social conditions external to the process itself, it would seem that discriminatory outcomes are not inevitable in the way that the effects of ineliminable discretion might be.  The issue, then, becomes the moral import of problematic social conditions that “warp” the institution of capital punishment.  How does such “warping” affect any justification of the death penalty?  Does it matter morally that the institution of capital punishment exists amidst a society insufficiently just regarding matters of economic class or race?

For a utilitarian approach to capital punishment, the issue is addressed in terms of total consequences for the society.  As with other kinds of arbitrariness previously reviewed, any discriminatory outcomes of the institution of capital punishment are part of the total cost of the practice and are to be considered along with all other costs and benefits.  Depending on the causal consequences of the practice in a society at a given time, then, capital punishment is or is not morally justified.  For some retributivists, however, the relevance of current social conditions can be quite different for whether capital punishment is morally justified.  For example, the fairness approach to punishment and the death penalty presupposes a society with reasonably just rules of cooperation that bestow benefits and burdens on its members. Whether America today, for example, satisfies such a pre-condition is, for some, doubtful; and thus, it is argued, even if justified in theory, capital punishment is not justified under current social conditions (for example, Reiman).  Also, retributivists typically presuppose punishment is to address misconduct that is voluntary, a matter of free choice.  But Marx, for example, maintains that such a presupposition of free will is simply false, a delusion:

Is it not a delusion to substitute for the individual with his real motives, with multifarious circumstances pressing upon him, the abstraction of “free will”…?  Is there not a necessity for deeply reflecting upon an alteration of the system that breeds these crimes, instead of glorifying the hangman who executes a lot of criminals to make room for the supply of new ones?

Though Marx is himself sympathetic to a retributivist justification of punishment, theory and practice cannot be divorced.  Marx and many Marxists oppose capital punishment because it is inapplicable to the actual conditions of society where criminality is rooted in structural inequalities of wealth (Murphy).  Thus, for some retributivist and utilitarian approaches to capital punishment, the death penalty may be morally unjustified because of inherently imperfect legal procedures, morally problematic outcomes, or the social conditions surrounding the institution.

c. Medicine and the Death Penalty

In recent years, issues of medical ethics have been a facet of philosophic focus on the institution of capital punishment, especially in America.  Health care professionals—including physicians—can be active participants in the actual execution of a death-row prisoner.  Medical expertise needed for an execution itself can include administering medicines or psychiatric treatments to calm the condemned, judging whether intramuscular or intravenous techniques are best, or actually injecting a lethal dose of drugs to bring about a death (Gaie, 1).  Even if not directly participating in executions and regardless of the method of execution employed, health care professionals can be involved by providing capital trial testimony related to findings of guilt or punishment, such as competency to stand trial, possibly exculpating mental illness, or forensic analyses of murder scene evidence.  Physicians are needed to certify death following a successful execution, and they may have a role in possible organ donations arranged by the deceased (Gaie, 2).  All such participation requires relevant expertise and is important to contemporary death penalty practices.  An important question, however, is whether it is morally permissible for health care professionals to be involved or participate in the institution of capital punishment.

A common assumption is that health care professionals—physicians, at least—have significant moral duties to those they treat or administer to.  Many, like Gaie, address such issues of professional ethics as independent of the morality of capital punishment itself.  Thus, for example, since physicians have a duty to minimize suffering, it would seem to follow that medical professionals’ participation is morally justified for that purpose, perhaps especially in executions by lethal injection.  Others maintain that, analogous to relieving the suffering of a torture victim so that they can be further tortured, physicians ought not participate in executions in order to reduce the suffering of the condemned (Dworkin).  Physician participation in an unjust practice, such as capital punishment, makes them complicit and, so, they ought not be involved. Thus, it is argued, one cannot separate the ethics of physicians’ participation in capital punishment from the moral merits of the institution itself (Litton).

Since the early 1980s, lethal injection has almost completely replaced electrocution as the preferred method of execution for those convicted of a capital crime and sentenced to death in the United States.  This recent, novel method of execution has itself generated considerable controversy.  First, unlike other constitutionally permissible modes of execution in America (that is, electrocution, hanging, firing squad, gas inhalation), a lethal injection requires medical expertise in order to be administered properly.  Thus, health care professionals must be direct participants in executions: for example, by preparing the lethal drug dosages, by establishing suitable sites for an injection, and by actually administering the drugs that cause the death of the convicted.  In comparison to other methods of execution, such participation is more essential, more direct, and ethically more problematic.  Execution by lethal injection makes more acute and controversial the ethical issues surrounding the involvement of health care professionals in the institution of capital punishment.  Second, whether employing the typical three-drug “cocktail,” or some variant of that process, acquiring the designated pharmaceuticals has often become difficult or impossible.  Some foreign-based companies face legal restrictions on exporting drugs for such uses, and some foreign and domestic drug companies, for reasons of public image or ethical considerations, for example, choose not to manufacture or supply their pharmaceutical products for use in executions.  This sometimes delays execution or leads governments to employ alternative drugs for which there may not be sufficient evidence of their effectiveness in bringing about a human death.

Third, whether any formulas for lethal injections are a humane way (or a more humane way) of causing death is itself controversial, with disputes about the science (or lack thereof) behind the drug formulas and protocols used, disagreements about the evidentiary significance of physiological data from autopsies used to assess the humaneness of death by lethal injection, etc.  Finally, so-called “botched executions” are still not entirely avoided by using lethal injection rather than electrocution or hanging, for example.  Cases do occur where the condemned endure an extended process of dying that sometimes suggests lingering sentience, discomfort, or suffering.  As with other facets of the institution of the death penalty, there is disagreement about the import of such practical challenges for the moral justification of capital punishment.

d. Costs: Economic Issues

At least in popular discourse, if rarely among philosophic discussions, considerations of monetary cost are adduced with respect to morally justifying capital punishment.  As Stephen Nathanson rightly recognizes, in its bald form it is a simple economic argument:  the state ought to execute murderers because it is less costly than imprisoning them for life (Eye, 33).  Even among proponents, though, cost considerations are perhaps plausibly relevant only as secondary, subsidiary supplements to some anterior justification for executing murderers: if murderers merit death as punishment for criminal misconduct, then economic cost is perhaps relevant to justifying their execution over a sentence of life spent in prison.

The argument depends crucially on the empirical claim that, in fact, it is less costly to execute murderers than it is to imprison them for life.  But the facts do not support this supposition.  The costs are not only those of a single execution, but those of a system of due process and an infrastructure of facilities and personnel needed for the institution of capital punishment (Nathanson, Eye, 36).  A possible reply is that such costs could be reduced, especially if we were to replace America’s elaborate “due process” for capital cases with something much more minimal: fewer appeals and appellate reviews, for example (Nathanson, Eye, 38).  Such an approach may reduce economic costs, but perhaps at the price of increasing the frequency of mistakes or arbitrariness.  Furthermore, reliance on comparative costs in determining who is executed potentially introduces a novel, morally suspect kind of arbitrariness.  Given that the cost of life imprisonment would be a function of a convicted murderer’s health and age, younger, healthier persons would be selected for the death penalty, while older, or more feeble, unhealthy killers would be sentenced to life in prison as the cheaper alternative.  The costs argument risks introducing a kind of age and medical status discrimination into the imperfect procedures employed to determine who merits the death penalty for murder.

6. State Authority and Capital Punishment

Exploring fully whether capital punishment is morally justified leads to considering a normative account of the modern state, its foundations, proper functions, and penal powers.  The modern practice of capital punishment presupposes a state which has the authority to make, administer, and enforce criminal law and procedures and then, if merited, impose the death penalty to address serious misconduct.  On what basis does the state possess the authority to punish by death?  This question of justification seems to raise issues about capital punishment that are “more squarely within the province of political philosophy” (Simmons, 311).

Contractarian accounts of the state share the feature that authority is derived from or constructed out of the authority granted to it by individuals that have or would “contract” to create it (see Social Contract Theory).  Any authority of the state to punish by death is, then, consent-based.  Thus, for example, as with others in the natural rights tradition, John Locke’s contractarian approach grounds state authority in individuals transferring their pre-political right to punish (including by death) those who have violated another’s basic rights by killing.  As Locke maintains in his Second Treatise of Government, the purpose of the state is to protect individuals’ basic rights, and individuals each grant the state the authority to protect rights through laws and punishments that are effective and comply with natural law principles about the amount of punishment (that is, lex talionis).  Though invoking such a pre-political right of individuals to punish is common in the natural rights tradition, and though there are some recent defenders of such an approach among libertarians (for example, Nozick), Locke himself admits that the notion of a natural executive right to punish “will seem a very strange doctrine to some men” (Treatise, sec. 9).

The classic contractarian theories of Jean-Jacques Rousseau and Thomas Hobbes also justify state authority to punish by death on grounds of individuals’ consent.  In the Leviathan, the pre-political state of nature is famously characterized by Hobbes as a life “solitary, poor, nasty, brutish, and short” (89; Ch. 13).  This life in the state of nature is so insecure that each person, as a means to self-preservation, authorizes the created sovereign power—the state—to punish by death criminal misconduct “to the end that the will of men may thereby better be disposed to obedience” (214; Ch. 28).  Rousseau, in On the Social Contract, holds that “the social treaty has as its purpose the conservation of the contracting parties,” each of whom wills the means to the end of preserving his life.  “And whoever wishes to preserve his own life at the expense of others should also give it up for them when necessary….  It is in order to avoid being the victim of an assassin that a person consents to die, were he to become one” (35; Book II, Ch. v).  And so, Rousseau maintains, the political society has the right to put to death, even as an example, those who cannot be preserved without danger to others or the society itself.  In the case of all the classic social contract theories of the state, individuals’ consent to the practice of capital punishment is included in the created authority of the state to rule and to punish.

Some more recent contractarian accounts of state authority to punish are explored in the spirit of John Rawls’s A Theory of Justice, with its Kantian conceptions of rationality and basic human goods (for example, liberties, autonomy, dignity).  The general idea is that a system of social cooperation is just if it would be consented to by rational, mutually disinterested individuals making their choice while ignorant of particularities about themselves and their own place in the system.  Such contractarian approaches typically support a penal system which merges both retributivist and utilitarian approaches in establishing a just system of punishment.  Whether such contractarian approaches justify capital punishment depends, as do classic social contract theories, on the details of the conditions under which a rational choice would be made.  A recent proponent of a contractarian theory of punishment, for example, argues that individuals would consent to an institution only if it would leave individuals better off than they would be in its absence.  This “benefit principle,” it is argued, justifies a system of punishment, as each would be better off with punitive sanctions than without.  As to capital punishment, though, “[c]an a person who receives the death penalty… regard himself as better off… than he would have been had he never agreed to the contract in the first place” (Finkelstein, “A Contractarian Approach…,” 216)?  There is a paradoxical air to individuals consenting to a system whereby they may be executed.  Finkelstein argues that, even if the death penalty deters, the benefit principle is not satisfied by a system of punishment that includes the death penalty.  On this contemporary contractarian theory, then, capital punishment is not justified because it would not be agreed to by rational individuals choosing the social institutions under which they would live.

A quite different approach to justifying state authority to punish by death appeals to the idea of societal self-defense or self-protection.  In a short piece, “On Punishment,” John Stuart Mill says, “the only right by which society is warranted in inflicting any pain upon any human creature, is the right of self-defense…. Our right to punish, is a branch of the universal right of self-defence” (79).  One recent development of this approach argues that a societal right of self-protection entails the right to threaten punishment for misconduct, and that a right to impose punishments follows from the society’s right to threaten sanctions (Quinn).  Whether a society has a right to threaten or impose a death penalty for murder, then, is based on its efficacy for deterrence and incapacitation, that is, as a protector of society.  A second, slightly different argument appeals more directly to the model of individual self-defense as a right.  Just as an individual has a right to use deadly force to address imminent, unavoidable aggression against self or other innocent parties, so society, as a collective, has a right to employ deadly force to address violent aggression against innocent third parties within that society.  The amount of punishment that society has the right to employ is constrained as it is for an individual’s moral right of self-defense: the response must be proportionate to the threatened loss.  So, given a moral right of individuals to employ deadly force in defense of their own or other innocents’ lives, by analogy society has such a right to use death as a punishment for murders of innocent third parties in the society.  Whether as an exercise of a right of self-protection or self-defense, the state then has the right to institute capital punishment for serious crimes such as murder.

7. References and Further Reading

a. Primary Sources

  • Aquinas, Thomas. Summa Theologiae. (1271-1272)
    • References to this extensive work are by number of question and article in the second part of part two (i.e., II-II), available at http://www.gutenberg.org/cache/epub/18755/pg18755.html.
  • Beccaria, Cesare. On Crimes and Punishments. (1764) Trans. David Young. Indianapolis: Hackett Publishing Company, 1986.
    • Quotations and references are by page number and chapter number to this translation and edition.
  • Bentham, Jeremy. An Introduction to Principles of Morals and Legislation (1789, 1823).
    • References to this classic text are by chapter and section number.
  • Camus, Albert. “Reflections on the Guillotine.” Resistance, Rebellion, and Death. Trans. Justin O’Brien. New York: Knopf, 1966. 175-234.
  • Hegel, G.W.F. The Philosophy of Right. (1821) Trans. T. M. Knox. Oxford: Clarendon Press, 1962.
  • Hobbes, Thomas. Leviathan. (1651) Edited by Richard Tuck. Cambridge: Cambridge University Press, 1991.
    • References to this text are by pagination in this edition, followed by chapter number, to allow reliance on various translations and editions available in print or on-line.
  • Kant, Immanuel. The Metaphysical Elements of Justice, Part I of The Metaphysics of Morals. (1797) Translated by John Ladd. Indianapolis: Bobbs-Merrill, 1965.
    • Quotations and parenthetical references are from this translation and edition, followed by the standard AK pagination, to allow reliance on various translations and editions available in print or on-line.
  • Locke, John. Two Treatises of Government. (1690) Ed. Peter Laslett. Cambridge: Cambridge University Press, 1988.
    • Quotations are from this recent scholarly edition; all references are to section number of The Second Treatise, to allow reliance on various other editions available on-line or in print.
  • Marx, Karl. “Capital Punishment.” New York Tribune. 1853. https://www.marxists.org/archive/marx/works/1853/02/18.htm.
  • Mill, John Stuart. “Speech in Favor of Capital Punishment 1868.” The Collected Works of John Stuart Mill, Vol. XXVIII: Public and Parliamentary Speeches. Eds. John M. Robson and Bruce Kinzer. Toronto: University of Toronto Press, 1988. pp. 266-273. http://oll.libertyfund.org/titles/mill-the-collected-works-of-john-stuart-mill-volume-xxviii-public-and-parliamentary-speeches-part-i.
  • Mill, John Stuart. “On Punishment.” The Collected Works of John Stuart Mill, Vol. XXI: Equality, Law, and Education. Ed. John M. Robson. Toronto: University of Toronto Press, 1984, pp. 77-79. http://oll.libertyfund.org/titles/mill-the-collected-works-of-john-stuart-mill-volume-xxi-essays-on-equality-law-and-education.
  • Plato. The Collected Dialogues. Ed. Edith Hamilton and Huntington Cairns. Princeton: Princeton University Press, 1961.
  • Ross, W.D. The Right and the Good. Oxford: Oxford University Press, 1930.
  • Rousseau, Jean Jacques. On the Social Contract. (1762) Trans. Donald A. Cress. Indianapolis: Hackett, 1987.
    • Quotations and references are to this translation and edition, using page number followed by book and chapter number, to allow reliance on various translations and editions available in print or on-line.

b. Secondary Sources

  • Bailey, William C. and Ruth D. Peterson. “Murder, Capital Punishment, and Deterrence: A Review of the Literature.” The Death Penalty in America: Current Controversies. Ed. Hugo Adam Bedau. Oxford: Oxford University Press, 1997. 135-161.
  • Banner, Stuart. The Death Penalty: An American History. Cambridge: Harvard University Press, 2003.
    • An excellent, thoughtful, and readable rendition of the long history of death penalty law and practice in America, from colonial beginnings through the end of the 20th century.
  • Bedau, Hugo Adam. “Bentham’s Utilitarian Critique of the Death Penalty.” Journal of Criminal Law and Criminology 74 (1983): 1033-1065.
  • Bedau, Hugo Adam. “Capital Punishment.” Matters of Life and Death: New Introductory Essays in Moral Philosophy. Third edition. Ed. Tom Regan. New York: Random House, 1980. 160-194.
  • Bedau, Hugo Adam, ed. The Death Penalty in America: Current Controversies. Oxford: Oxford University Press, 1997.
    • Despite its publication date, this anthology is still quite useful. It is the best, basic reference for primary and secondary source materials related to American death penalty law, constitutional issues, Supreme Court decisions, public attitudes, social scientific studies of deterrence, and explorations of procedural problems with capital punishment, including matters of race.
  • Bedau, Hugo Adam. Killing as Punishment: Reflections on the Death Penalty in America. Boston: Northeastern University Press, 2004.
    • Bedau has long been a prominent philosophic scholar specializing in research and writing about capital punishment in the United States. The first half of this volume is primarily descriptive of the American system, including problematic procedural outcomes and some recent history of the death penalty. The second half of the book “undertakes a critical evaluation…from a constitutional and ethical point of view.” As a matter of applied ethics, Bedau argues for abolition of the death penalty in reasonably just, constitutional democracies, such as the United States.
  • Black, Charles L., Jr. Capital Punishment: The Inevitability of Caprice and Mistake. Second edition. New York: Norton, 1981.
    • Written by a legal scholar, an accessible appeal to problematic outcomes of American criminal procedure as justification for abolishing the death penalty.
  • Caplan, Arthur A. “Should Physicians Participate in Capital Punishment?” Mayo Clinic Proceedings 82 (2007): 1047-48. http://www.mayoclinicproceedings.org/article/S0025-6196(11)61363-3/fulltext
  • Conway, David A. “Capital Punishment and Deterrence: Some Considerations in Dialogue Form.” Philosophy & Public Affairs 3 (1974): 431-443.
  • Davis, Michael. “Harm and Retribution.” Philosophy & Public Affairs 15 (1986): 236-266.
  • Duff, R. A. Punishment, Communication, and Community. Oxford: Oxford University Press, 2001.
  • Dworkin, Gerald. “Patients and Prisoners: The Ethics of Lethal Injection.” Analysis 62 (2002): 181-189.
  • Feinberg, Joel. “The Expressive Function of Punishment.” Doing and Deserving. Princeton: Princeton University Press, 1970. 95-118.
  • Feinberg, Joel. “Noncomparative Justice.” Rights, Justice, and the Bounds of Liberty: Essays in Social Philosophy. Princeton: Princeton University Press, 1980. 265-306.
  • Finkelstein, Claire. “A Contractarian Approach to Punishment.” The Blackwell Guide to the Philosophy of Law and Legal Theory. Ed. Martin Golding and William Edmundson. Oxford: Blackwell Publishing, 2005. 207-220.
  • Finkelstein, Claire. “A Contractarian Argument Against the Death Penalty.” New York University Law Review 81 (2006): 1283-1330.
  • Gaie, Joseph B.R. The Ethics of Medical Involvement in Capital Punishment: A Philosophical Discussion. Dordrecht: Kluwer Academic Publishers, 2004.
  • Hampton, Jean. “The Moral Education Theory of Punishment.” Philosophy & Public Affairs 13 (1984): 208-238.
  • Hart, H.L.A. “Bentham and Beccaria.” Essays on Bentham. Oxford: Clarendon Press, 1982. 40-52.
  • Hart, H.L.A. “Prolegomenon to the Principles of Punishment.” Punishment and Responsibility: Essays in the Philosophy of Law. Oxford: Clarendon Press, 1968. pp. 1-27.
    • This essay remains hugely influential in providing the dominant framework for philosophic theories of punishment, including the death penalty.
  • Hart, H.L.A. “Punishment and the Elimination of Responsibility.” Punishment and Responsibility: Essays in the Philosophy of Law. Oxford: Clarendon Press, 1968. pp. 158-185.
  • Heyd, David. “Hobbes on Capital Punishment.” History of Philosophy Quarterly 8 (1991): 119-134.
  • Litton, Paul. “Physician Participation in Executions, the Morality of Capital Punishment, and the Practical Implications of Their Relationship.” Journal of Law, Medicine, & Ethics 41 (2013): 333ff. University of Missouri School of Law Legal Studies Research Paper No. 2013-13. https://ssrn.com/abstract=2286788.
  • Mackenzie, Mary Margaret. Plato on Punishment. Berkeley: University of California Press, 1981.
  • McGowen, Randall. “The Death Penalty.” The Oxford Handbook of the History of Crime and Criminal Justice. Edited by Paul Knepper and Anja Johansen. Oxford: Oxford University Press, 2016. 615-634.
  • Montague, Phillip. Punishment as Societal Defense. Lanham: Rowman & Littlefield, 1995.
  • Morris, Herbert. “Persons and Punishment.” The Monist 52 (1968): 475-501.
  • Murphy, Jeffrie. “Marxism and Retribution.” Philosophy & Public Affairs 2 (1973): 217-243.
  • Nathanson, Stephen. An Eye For An Eye? The Morality of Punishing by Death. Second edition. Totowa, NJ: Rowman & Littlefield, 2001.
    • An accessible, readable argument to the conclusion “that the death penalty is not morally acceptable.” Nathanson considers a variety of arguments offered in defense of capital punishment in America: deterrence, costs, problematic procedural outcomes, moral desert and the death penalty, American constitutional considerations. An especially helpful treatment of the arguments based on criminal procedure in America.
  • Nathanson, Stephen. “Does It Matter if the Death Penalty Is Arbitrarily Administered?” Philosophy & Public Affairs 14 (1985): 149-164.
  • Nozick, Robert. Anarchy, State, & Utopia. New York: Basic Books, 1974.
    • Chapter 4 deals with theories of punishment (retributive and deterrence) with respect to a contractarian theory of a libertarian state developed in the spirit of John Locke’s emphasis on individual rights.
  • Nozick, Robert. Philosophical Explanations. Cambridge: Harvard University Press, 1981.
    • Section III of Chapter 4 (pp. 363-398) deals with punishment as communication, including some ambivalence about its implications for the death penalty for murderous offenders.
  • Nussbaum, Martha. “Equity and Mercy.” Philosophy & Public Affairs 22 (1993): 83-125.
  • Pojman, Louis. “For the Death Penalty.” The Death Penalty: For and Against. Lanham, MD: Rowman & Littlefield, 1998. 1-66.
  • Pojman, Louis, and Jeffrey Reiman. The Death Penalty: For and Against. Lanham, MD: Rowman & Littlefield, 1998.
    • Distinctly different, opposing, nuanced approaches to the death penalty in the context of more general theories about punishment and illustrating ways in which justifications are often hybrid theories that synthesize elements of retributivism and consequentialism. Both authors also address the import of imperfect criminal procedures in the administration of the death penalty in America (or perhaps anywhere). The text includes a response by each to the other’s arguments.
  • Quinn, Warren. “The Right to Threaten and the Right to Punish.” Philosophy & Public Affairs 14 (1985): 327-373.
  • Radin, Margaret Jane. “Cruel Punishment and Respect for Person: Super Due Process for Death.” Southern California Law Review 53 (1980): 1143-1185.
  • Rawls, John. A Theory of Justice. Revised edition. Cambridge: Harvard University Press, 1971, 1999.
  • Reiman, Jeffrey. “Justice, Civilization, and the Death Penalty: Answering van den Haag.” Philosophy & Public Affairs 14 (1985): 115-148.
  • Reiman, Jeffrey. “Why the Death Penalty Should be Abolished in America.” The Death Penalty: For and Against. Lanham, MD: Rowman & Littlefield, 1998. 67-132.
  • Schabas, William. The Abolition of the Death Penalty in International Law. Third edition. Cambridge: Cambridge University Press, 2002.
    • An excellent survey of the title topic, an aspect of capital punishment not often engaged in the work of others in this list.
  • Royal Commission on Capital Punishment 1949-1953: Report. Cmd. 8932. London: Her Majesty’s Stationery Office, 1953.
  • Simmons, A. John. “Locke and the Right to Punish.” Philosophy & Public Affairs 20 (1991): 311-349.
  • Sorell, Tom. “Aggravated Murder and Capital Punishment.” Journal of Applied Philosophy 10 (1993): 201-213.
    • An excellent analysis of the arguments of John Stuart Mill and Immanuel Kant in defense of capital punishment for at least some murders.
  • Sorell, Tom. Moral Theory and Capital Punishment. Oxford: Basil Blackwell in association with the Open University, 1987.
    • Though the primary aim of this book is to show how philosophic arguments and theories “can be useful in producing an improved moral rhetoric,” Sorell does offer a non-consequentialist and retributivist defense of capital punishment on the ground that murderers deserve to die. He opposes alternative forms of retributivism (e.g., appeals to fairness) and argues that utilitarian or consequentialist arguments are inconclusive, including J.S. Mill’s little-known defense of capital punishment.
  • Stalley, R.F. An Introduction to Plato’s Laws. Indianapolis: Hackett, 1983.
  • Ten, C.L. Crime, Guilt, and Punishment. Oxford: Clarendon Press, 1987.
    • A clear, organized introduction to an array of recent theories of punishment, though not specifically addressed to issues of capital punishment. Chapter 7, “The Amount of Punishment,” engages retributivist and utilitarian approaches to justifying the form or kind of punishment for offenders.
  • United Nations. “The Universal Declaration of Human Rights.” (1948). http://www.un.org/en/universal-declaration-human-rights/.
  • United Nations. “International Covenant on Civil and Political Rights.” (1976). http://www.ohchr.org/en/professionalinterest/pages/ccpr.aspx.
  • United States. House of Representatives. The Constitution of the United States of America. Washington: Government Printing Office, 2000. https://www.gpo.gov/fdsys/pkg/CDOC-110hdoc50/pdf/CDOC-110hdoc50.pdf.
  • Waisel, David. “Physician Participation in Capital Punishment.” Mayo Clinic Proceedings 82 (2007): 1073-1080. http://www.mayoclinicproceedings.org/article/S0025-6196(11)61369-4/fulltext.

 

Author Information

Robert Hoag
Email: bob_hoag@berea.edu
Berea College
U. S. A.

Lao Sze-kwang (Lao Siguang) (1927—2012)

The works of Lao Sze-kwang (Lao Siguang) cover a wide range of philosophies, including Confucianism, Buddhism, Daoism, Kantianism, Hegelianism, and, most importantly, the philosophy of culture. Like many other Chinese philosophers of the 20th century, Lao was personally affected by the Chinese Revolution of 1949 and made his career outside of mainland China, having first fled to Taiwan and then Hong Kong after the victory of Mao Zedong’s Communist forces in China’s civil war. Along with other modern Chinese philosophers, Lao was deeply interested in the problem of China’s modernization and actively participated in politics. These two aspects of his intellectual biography, in turn, help to define his work as a philosopher, which focused on contextualizing Chinese philosophy in relation to Western thought as well as emphasizing the practical, as opposed to purely theoretical, dimension of Chinese philosophy.

In his multi-volume New Edition of the History of Chinese Philosophy (1984-1986), he tried to reconstruct traditional Chinese philosophy with the help of modern Western philosophy but in ways that both resembled and differed from the work of so-called “New Confucians” such as Mou Zongsan and Tang Junyi. This work strongly influenced the study of Chinese philosophy in Hong Kong and Taiwan and helped shape several generations of Chinese scholars’ understanding of traditional Chinese thought. Unlike Feng Youlan or Hu Shi, each of whom authored competing histories of Chinese philosophy, Lao attempted to define Chinese philosophy not in terms of what it is (an essentialist approach) but in terms of what it does (a functionalist approach). For Lao, Chinese philosophy functions primarily in an “orientative” manner, shaping and guiding Chinese people’s values at a deep level rather than merely advocating particular propositions or theories. In this way, Lao thought, Chinese philosophy was distinct from other social philosophies, particularly those of the West.

Having summarized the legacies of both traditional Chinese and modern Western thought, Lao later developed a philosophy of culture that preoccupied him from the 1990s until his death in 2012. In his Lectures on Philosophy of Culture (2003), Lao declared that his philosophy was driven by a cultural crisis of consciousness, as he realized that he grew up in a frustrating age in which traditional Chinese culture was declining in influence while modern culture was yet to be established as a new cultural order. Lao criticized both traditionalist and anti-traditionalist approaches to the modernization of Chinese culture and tried to develop his own approach.

Table of Contents

  1. Biography
  2. On Chinese Philosophy
    1. New Edition of the History of Chinese Philosophy
    2. The Fundamental Question Method
    3. Chinese Philosophy as an Orientative Philosophy
  3. On Philosophy of Culture
    1. Cultural Spirit as Self-Consciousness of Value
      1. Categories of the Self
      2. Moral Subjectivity in Confucianism and the Rejection of Metaphysical Interpretation
      3. “Eastern Spirit” vs. “Western Spirit”
    2. Modernization of Chinese Culture
      1. The Problem of Objectivity
      2. The Problem of Traditionalism
      3. The Problem of Anti-Traditionalism
  4. Criticisms and Influence
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Lao Sze-kwang was born in Xi’an, capital of Shaanxi province, in 1927 to a highly educated military officer’s family. His grandfather scored high on the Qing dynasty’s (1644-1912) imperial civil service examination and was appointed Governor General of Liangguang (modern Guangxi and Guangdong provinces). In 1860, he negotiated Britain’s lease of the Kowloon peninsula and the New Territories following China’s defeat in the Second Opium War (1856-1860). Throughout his childhood, Lao received a traditional Chinese education at home and even composed his first classical poem at the age of seven.

In 1946, Lao entered the Department of Philosophy at Peking University but fled to Taiwan in 1949 when the Communist Party took control of mainland China. He graduated from the Department of Philosophy at Taiwan University in 1951. As a liberal, however, Lao had difficulty tolerating the Kuomintang (KMT) military dictatorship in Taiwan headed by Chiang Kai-shek. To protect him from political persecution in Taiwan, Lao’s father, a general who served under Chiang, asked him to flee to Hong Kong in 1955. In 1964, Lao joined Chung Chi College at the Chinese University of Hong Kong as a lecturer in philosophy. Lao’s colleagues in Hong Kong included Mou Zongsan and Tang Junyi. As opposed to his pro-KMT colleagues, who condemned only the Communist regime in mainland China, Lao condemned both the KMT’s and the Chinese Communist Party’s dictatorships and refused to join Mou’s and Tang’s “New Confucian” campaign. Lao was later promoted to Senior Lecturer, Reader, and finally Head of the Philosophy Department at the Chinese University of Hong Kong. Although Lao officially retired in 1985, he continued to serve as Honorary Senior Research Fellow at the Institute of Chinese Studies and as Senior College Tutor at Shaw College.

In 1989, Lao returned to Taiwan, where the process of democratization had begun. In 1994, Lao became Chair Professor at the Department of Philosophy of Huafan University. The government of the Republic of China (Taiwan) awarded Lao the National Cultural Award in 2001 and the Republic of China Academia Sinica Fellowship in 2002. In 2012, Lao died in Taiwan.

2. On Chinese Philosophy

a. New Edition of the History of Chinese Philosophy

From 1984 to 1986, Lao published his most influential work, New Edition of the History of Chinese Philosophy. It consists of four volumes of a systematic reconstruction of Chinese philosophy, from the pre-Qin period (the period before the establishment of the Qin dynasty in 221 B.C.E.) to the Qing dynasty (1644-1912). Lao regarded it as the first complete work on the history of Chinese philosophy because, according to him, all previous writings about the history of Chinese philosophy focused on its history rather than on the reconstruction of its philosophical problems. While Lao acknowledged Huang Zongxi’s (1610-1695) Record of the Ming Scholars as a work that satisfied his criteria for an authentic history of Chinese philosophy, it merely addressed the development of Confucian thought during the Ming dynasty (1368-1644). Similarly, Lao did not consider Hu Shi’s (1891-1962) History of Chinese Philosophy an authentic study of the history of Chinese philosophy because, in his view, it focused too much on archaeological investigation rather than philosophical discussion. For Lao, a valid work on the history of Chinese philosophy should reconstruct the theories of past philosophers by investigating their philosophical questions and answers. Lao argued that although Feng Youlan’s (1895-1990) History of Chinese Philosophy explored the philosophical problems of past Chinese philosophers, it failed to grasp the nature of Chinese philosophy as the pursuit of virtue (that is, the complete actualization of humanity’s innate moral goodness) due to the overwhelming influence of Western metaphysical concepts and Marxist ideology on Feng’s thought. Thus, Lao refused to acknowledge Feng’s book as a work on Chinese philosophy. Instead, Lao argued that his own book was the first true account of the history of Chinese philosophy because of its accurate reconstruction of the philosophical questions and answers of past Chinese philosophers.

b. The Fundamental Question Method

In New Edition of the History of Chinese Philosophy, Lao suggested a new methodology for the study of Chinese philosophy called the Fundamental Question Method. Lao argued that every school of philosophy chooses a single ultimate question to answer, known as “the fundamental question.” The procedure for identifying that question and reconstructing a theory around it is as follows:

  1. Investigation of a philosophical text and its historical background
  2. Reconstruction of the text’s arguments
  3. Deduction of the original intention behind the arguments
  4. Identification of the fundamental question
  5. Reconstruction of the logical relations between the questions and answers on the basis of the fundamental question

In order to answer such questions, philosophers provide several answers or solutions, which may lead to new sub-questions. The question-answering process is the development of a philosophical school. A fundamental question leads to several levels of sub-questions with their corresponding answers. All of these questions and answers construct a complete theory. The problem is that some philosophers may not explicitly declare their own fundamental question. Therefore, a “theoretical reduction” is needed. One may deduce the philosopher’s original intention from the arguments found in his writings.

It is important to note that the Fundamental Question Method does not resist the introduction of Western philosophical concepts to the study of Chinese philosophy. Instead, Lao emphasized that his Fundamental Question Method employed a “Western logical analysis” due to the relative absence of logic and epistemology in the Chinese tradition. To articulate the fundamental questions of Chinese philosophies, one must inevitably employ Western disciplines such as hermeneutics and logic. Lao used a microscope analogy to justify his support of the use of logical analysis in the study of the history of Chinese philosophy. Although the microscope was invented by the Europeans in the modern era, it can still be used as an instrument to study ancient bacteria in Africa, which existed long before their discovery under the microscope. Likewise, logical analysis is a “microscope for thoughts.” Principles discovered through logical analysis have existed since before its introduction. Spatiotemporal differences do not weaken the universal validity of principles discovered through logical analysis, for logic is a universal science. The remainder of this section demonstrates Lao’s application of the Fundamental Question Method to Confucius’ philosophy as articulated in New Edition of the History of Chinese Philosophy.

Lao identified the Analects as the only reliable source for studying Confucius’ philosophy, as it was allegedly composed by Confucius’ students. According to the Records of the Grand Historian, written by Sima Qian (145-86 B.C.E.), Confucius was employed as an official ritual expert by the ancient state of Lu (modern Shandong province). For Confucius, the ceremonies and literature of the Western Zhou dynasty (1046-771 B.C.E.) provided the basis for all proper and efficacious ritual (li), but by his time, these had been forgotten, diluted, or abused as a result of China’s increasingly fractious political and military environment. As Lao pointed out, the Confucian view of li goes beyond mere ceremony to encompass internal attitudes within ritual actors and participants, especially a sense of reverence for tradition and an aesthetic sensitivity to beauty as a mode of order. As a way of restoring harmonious order to a disorderly and divided society, li was vital to Confucius’ understanding of the purpose of both collective culture and personal self-cultivation. Therefore, Lao considered the concept of li as the starting point of Confucius’ philosophy.

However, Lao did not think that li itself was the central concept of Confucius’ philosophy. According to him, Confucius’ concept of li points at a deeper principle: social order. The distinction between li as the guarantor of social order and mere ritual was already being discussed before Confucius developed his philosophy. In the Zuozhuan (Zuo Commentary on the Spring and Autumn Annals), an ancient Chinese chronicle, when the Duke of the state of Jin praised the Duke of Lu for being good at keeping li in archery performance, his minister, Ru Shuqi, disagreed and distinguished li from deportment, or the following of rituals: “[li] is that by which [a ruler] maintains his State, carries out his governmental orders, and does not lose his people.” Lao coined the term quanfen (division of power and responsibility) to describe what Neo-Confucian philosophers called lifen (division by ritual, that is, reason and duty): a sense of social order achieved by the division of society into class-specific titles and duties. It is this sense of social order on which Confucius’ philosophy was based, according to Lao, and toward which li as a kind of social performance and inner experience was to be aimed.

What is the legitimacy of li? According to Lao, Confucius reduced li to yi (righteousness) and ren (humaneness or benevolence). Confucius said that “[t]he gentleman takes rightness as his substance, puts it into practice by means of ritual [li], gives it expression through modesty, and perfects it by being trustworthy. Now that is a gentleman” (Analects 15:18). Righteousness stands for the normative principles used to distinguish between right and wrong. Ren is the basis of righteousness, as Confucius said a person practising ren is free from wickedness (Analects 4:4). Ren is the highest moral goodness. Therefore, li is ultimately reduced to ren. Confucius clearly argued that ren constitutes the restraint of the self for the return to li (Analects 12:1). Li is the way of practising ren. But, what is ren? In Lao’s Essential of Chinese Culture (1998), he argued that ren is the foundation of righteousness, devotion to public well-being (gongxin), and purification of intention. Ren implies the denial of self-interest, for righteousness contrasts with narrow self-interest (Analects 4:16). Once a person has denied his self-interest, he will pursue public well-being and righteousness.

To sum up, Lao argued that Confucius’ fundamental question was, “How can we preserve the social order?” As such, it leads to the question, “What is the legitimacy of li?” By reducing the concept of li to yi (righteousness) and ren (humaneness or benevolence), Lao approached Confucius’ answer systematically: the legitimacy of li comes from ren. Once a person practices li, he actualizes ren as devotion to public well-being.

c. Chinese Philosophy as an Orientative Philosophy

In his 1989 article “On Understanding Chinese Philosophy: An Inquiry and a Proposal,” Lao defined Chinese philosophy as an Orientative Philosophy and tried to convince Western scholars to regard Chinese thought as a philosophical tradition. He introduced a simplified version of the Fundamental Question Method for the reconstruction of Chinese philosophies, which he called the “purpose theory.”

Firstly, Lao argued that philosophy is reflective thinking about certain functions of philosophy. Reflective thinking occurs when a person reflects on his own activities. Epistemology and hermeneutics reflect on the nature of human knowledge and understanding, while ethics reflects on the nature of human moral action. Metaphysics reflects on the underlying unity of the empirical world. The subject matter of reflective thinking in different periods and places can be radically different. When reflective thinking addresses a certain type of subject matter, it provides particular philosophical solutions to a particular philosophical problem. Therefore, to understand a particular philosophy, one must understand the problems with which it deals. If the problems of a particular philosophy have no relevance whatsoever to real life, said Lao, it should be rejected.

Secondly, Chinese philosophy is an orientative reflective thinking that intends to transform the self or the world. According to Lao’s definition, an orientative philosopher should suggest that there is a final purpose in life and that people should try to actualize such purpose in their daily living. Different schools of Chinese philosophy provide different normative guidelines or regulations for daily life. Lao named such guidelines “purpose-theories,” which contain three steps: (1) selecting a purpose, (2) justifying the purpose, and (3) offering practical maxims for people actualizing such a purpose.

Lao used the work of the Daoist philosopher Zhuangzi and the Confucian philosopher Mencius (Mengzi) as examples. Zhuangzi’s purpose was to achieve xiaoyao, which means “absolutely unburdened and unbound freedom” of the mind or the self. Lao identified xiaoyao as “transcendent freedom” because it exerts no influence on the objective world. To see how he justified this purpose, one must understand Zhuangzi’s view of the world. According to Zhuangzi, the world is changing. He used the term hua to indicate the concept of change. The physical self is illusory, as “every empirical existence is in a relation of transformation with other existences. The elements that constitute my body did constitute, and will constitute, other things in the same time/space structure, or empirical world.” The “body is no more than a congregation of physical elements… which will disintegrate when the elements move to form other physical things” (Lao 1989, 280). The principle governing all changes in the world is known as zaohua (meaning “making changes”). In other words, the real self cannot be anything physical that changes restlessly. Instead of defining the non-physical self as the “real self,” Zhuangzi simply did not consider the real self as an object. The real self is beyond all beings and gets rid of self-limiting inclinations. The real self should not fall into the realm of beings, or else it would be limited by changes. Zhuangzi even denied the existence of anything valuable in the physical world. He rejected cultural values as limiting one’s freedom, for knowledge and values exist only within a system in which the criterion of truth is relative. There is no universal criterion of truth that transcends all theories and systems, according to Zhuangzi. The endless debates among philosophies, like the debate between Confucianism and Mohism, can never manifest the truth. Only the transcendent freedom of the self is valuable, according to Zhuangzi. One must become enlightened to enjoy transcendent freedom. Lao, however, argued that Zhuangzi did not provide a practical maxim that could teach people how to become enlightened.

In contrast to Zhuangzi’s purpose of the transformation of self, Lao saw Mencius’ purpose as the transformation of the world by creating a cultural order actualized by li (ritual propriety), yi (righteousness), and ren (humaneness or benevolence). In order to justify his purpose, Mencius established a doctrine of mind and essence (xin xing lun). He argued that the real root of all moral and cultural values is within human nature. As long as one can maintain the mastery of the mind over the body, one can act morally. According to him, there are “four beginnings (sishanduan),” which are also known as the four basic qualities of the mind: “The sprout of humaneness or benevolence [ren],” “the sprout of righteousness [yi],” “the sprout of ritual propriety [li],” and “the sprout of wisdom [zhi]” (Mencius 2A6). Xing, which can be translated as human nature or essence, is universal to every human being. The four beginnings are four innate moral capacities. Thus, the legitimacy of moral and cultural values and orders is determined by such universal virtues. A righteous government is one in which the rulers love the people through the virtue of ren. The legitimacy of authority lies in the will of the people. For practical maxims, Mencius emphasised not only self-transformation but also social transformation. Social transformation can only be achieved by a virtuous leader who fully actualizes his innate moral capacities. Self-transformation, however, can be achieved simply through enlightenment. As long as a person is conscious of his innate moral capacities and actualizes them, he achieves self-transformation. Using Zhuangzi and Mencius as examples, Lao argued that Chinese philosophies, as orientative philosophies, aim to answer the question of “where to go” instead of “what it is” (Lao 1989, 290). Both Zhuangzi and Mencius tried to provide some directions for how one ought to live.

Overall, Lao’s purpose theory is merely a simplified version of the Fundamental Question Method. With the Fundamental Question Method, one must identify the fundamental questions, sub-questions, and answers so as to reconstruct the system of a particular Chinese school of philosophy. With the purpose theory, however, one only needs to identify a Chinese philosopher’s purpose and its justification. One does not need to investigate how a fundamental question leads to several secondary questions and answers.

3. On Philosophy of Culture

While Lao is famous for his contribution to the study of the history of Chinese philosophy, his major research interest lies in Chinese cultural issues—namely, the philosophy of culture. Despite Lao’s deep interest in Chinese thought, his philosophy of culture draws heavily from German idealism, especially Hegel’s doctrine of “externalization,” which postulates that just as human beings come to perceive the world as alien and even hostile because it is “other,” not human—because the observed world is different from the human self that observes—so too can human beings reconcile themselves to the world by realizing that the world really is not “other” in relation to themselves.

In Lectures on Philosophy of Culture, Lao distinguished philosophy of culture from cultural critique or cultural studies. Having adopted Habermas’ trichotomy (distinction between philosophy, critique, and science) from the essay “Between Philosophy and Science—Marxism as Critique,” Lao argued that philosophy is merely theoretical, science is merely epistemically judgmental (according to experience), and critique is both theoretical and judgmental (Lao 2002, 42).

In order to understand Lao’s take on philosophy of culture, one must understand his definition of the term “philosophy of culture.” According to him, philosophy of culture is both descriptive and normative. As a descriptive philosophy, it aims to describe the nature or essence of a particular culture, while as a normative philosophy, it aims to evaluate the orientations or trends of cultural development. The term “culture” has two meanings: phenomenon and spirit. Human activities that express meanings and values are cultural phenomena, while the systems of those inner meanings and values are cultural spirits. Anthropology and social sciences only study cultural phenomena. Only philosophy investigates the underlying values and meanings behind those phenomena. Cultural spirit, which is “value consciousness,” determines the modes of cultural phenomena. When human beings are conscious of the existence of values and meanings and try to manifest them in human activities (including ideas, attitudes, systems, and customs), they are actualizing their cultural spirits. A cultural spirit is determined by free will (Lao 1998, 6-7). Therefore, culture is defined as a spirit or as “the actualization of value consciousness,” namely, the process of manifesting values in human activities. Because the subject matter of philosophy of culture is the cultural spirit, it is appropriate to look at Lao’s narratives of Chinese cultural spirit to understand his particular philosophy of culture.

a. Cultural Spirit as Self-Consciousness of Value

i. Categories of the Self

Subjectivity is an essential concept in Lao’s philosophy. Lao categorized the self into the moral self, the cognitive self, and the aesthetic self in terms of the “field of subjective activities,” namely, the mind activity of the self, manifesting itself in the external world against the limitations of the body. He called this the “trichotomy of the self.” In terms of the number of selves, Lao provided another distinction: the “single subject” and the “multiple subjects.” This distinction is known as the “dichotomy of the subject” (Lao 2000, 219).

In order to understand the concepts of cognitive self, moral self, and aesthetic self, one must understand the nature of the “subjective mind activity.” According to Lao, mind activity tries to manifest itself in the external world and to achieve self-actualization. To achieve self-actualization, one must overcome the physical limitation imposed by the body and the external world. In other words, self-actualization is a struggle for freedom against limitation. The body or the physical self of a person is not his real self, for the body is determined by external factors instead of one’s free will. The body is mechanical and determined by conditions; when the hand feels the heat of the fire, it immediately moves away from the fire.

The cognitive self is manifested as cognitive judgment. Cognitive judgment aims to analyze or reflect on experience so as to produce certain knowledge. Cognitive judgment is objective and universally valid. The facts that 1 + 1 = 2 and that H2O is water are both universally and objectively true. The cognitive self is the subject undertaking such cognitive judgment. The freedom of the cognitive self is not complete freedom, for cognitive judgment is nonetheless limited by external conditions, which are experiences. The self has no dominion over the experience it receives. Free will does not affect the validity of the judgment.

The moral self is manifested as ethical judgment. Ethical judgments are value judgments. Value judgments are unconditional. When one says that killing is morally wrong, one means that killing is always wrong. Therefore, ethical judgments are also universal. However, the objects of value judgments must be free human beings. When person A considers his own action C1 as being morally right or wrong, he assumes that C1 is an intentional action done by himself according to his free will. In other words, person A assumes that he has dominion over his own action C1 and is responsible for his own actions. The self, manifested in value judgment, therefore, is always dominant over some actions (Lao 1998, 143).

The aesthetic self manifests itself in aesthetic or emotional judgment. Unlike the moral self or the cognitive self, the aesthetic self brings no universal judgment. Aesthetic judgment, like sexual desires or appetite, is merely about preferences. When person B makes an aesthetic judgment that he wants an object or a state of affairs x, he assumes that his desire for x is intentional, but his desire for x is never universal. Someone from Hong Kong may want to drink Hong Kong-style milk tea for breakfast this morning but British-style Earl Grey tea tomorrow morning. A person may fall in love with different people at different times and places. More importantly, “B wants x” is only valid for person B alone. In other words, the validity of an aesthetic judgment is subjective. Person B “clings to” object x where such satisfaction depends on external conditions, for the satisfaction of desire is not determined by one’s own free will. Therefore, the dominion and the freedom of an aesthetic self are very limited.

ii. Moral Subjectivity in Confucianism and the Rejection of Metaphysical Interpretation

Confucianism, according to Lao, emphasizes only the moral self. Like his contemporaries, Lao acknowledged that subjectivity (being a moral subject itself) is essential to Confucianism. Mencius’ doctrine of mind and essence affirmed that everyone shares the innate moral goodness to act morally in the form of “four beginnings.” One does not need to learn how to be a moral person. One need only acknowledge the fact that human nature is good and actualize one’s own moral capacity by following certain teachings. According to Lao, Mencius was the first person to argue that the origin of moral values and virtues, namely, the self-consciousness of one’s innate moral capacity, is internal instead of external. However, the Han dynasty Confucians were ignorant of Mencius’ xin xing lun and instead defined heaven as the external source of moral values with the help of yin yang cosmology. Lao criticized the Han Confucians for being too “metaphysical” (Lao 1984, vol. 2, 10). Nonetheless, the later Confucian thinkers known as “Neo-Confucians” gradually moved from cosmology to the xin xing lun. Among them, the Lu-Wang school, which argued that the mind itself is the innate moral principle (xin ji li, “the mind is the reason”), is closer to Mencius’ philosophy than the Cheng-Zhu school, which argued that the heavenly command is the supreme moral principle (xing ji li, “the essence is the reason”) (Lao 1984, vol. 3a, 489).

Because Confucians argued that the origin of moral value is internal to every human being, they emphasized the priority of the moral self or of subjectivity. The moral self transcends bodily limitation and postulates moral principles according to its own innate moral capacity. The moral self manifests freedom through moral actualization or virtue completion. The social and cultural orders are not sources of moral values. Rather, they are the instruments that help every individual to actualize his own moral capacity. Confucian emphasis on the moral self leads to a Confucian doctrine of culture: that all cultural phenomena should manifest the internal moral values in every individual. This doctrine is essential to Lao’s analysis of the nature of traditional Chinese culture, which is strongly shaped by Confucian ethics.

iii. “Eastern Spirit” vs. “Western Spirit”

Lao contrasted the “Eastern spirit” with the “Western spirit” by claiming that the Eastern spirit is virtue-oriented while the Western spirit is wisdom-oriented. In Collection of Essays on Cultural Problems (2000), Lao argued that the wisdom-oriented spirit originating from ancient Greek philosophy is the orthodox cultural spirit in the West, unlike the faith-oriented Hebrew spirit (Lao 2000, 30). A wisdom-oriented spirit is a spirit that pursues objective knowledge. The Eastern spirit, however, is a virtue-oriented spirit, which pursues moral order. Easterners aim to establish a proper way of living. Both the Chinese cultural spirit and the Indian cultural spirit are virtue-oriented Eastern spirits. The Chinese cultural spirit, however, emphasises moral actualization and virtue completion, while the Indian cultural spirit emphasises renunciation. The Chinese cultural spirit is dominated by the Confucian spirit (Lao 2001, 218), which aims to construct a “reasonable” or “proper” social order according to human nature (Lao 2000, 48).

According to Lao, the ancient Chinese, unlike Westerners, emphasized social order more than knowledge about the external world. The “reasonableness” of social order is not about objective evidence. Instead, it is about its coherence with innate human nature. As discussed above, Confucius reduced li as social order to yi (righteousness) and ren (humaneness or benevolence). The legitimacy or the moral value of social order does not depend on an external God, monarch, or objective form. Instead, it depends on ren alone, which is internal to the moral self. Li is the actualization of the virtue of ren. The rituals and order between parents and children manifest filial and familial love, while those between monarchs and ministers manifest teamwork, solicitude, and loyalty. As a result, the moral self becomes the only source of all moral and cultural values. The legitimacy of moral and cultural values is not demonstrated by experiences or reasoning.

Under the Chinese cultural spirit, the aesthetic self and the cognitive self are subordinate to the moral self. Because, for Lao, the ultimate concern in Chinese culture is virtue completion or actualization, knowledge and arts are merely instruments for moral practices. For example, poems and verses should manifest proper moral values. Architecture is merely used for the sake of improving people’s well-being, as in the construction of China’s Grand Canal and Great Wall. Arts and sciences are not used for their own sakes. They are all used for the sake of moral actualization.

Having distinguished the Chinese cultural spirit from other cultural spirits, Lao argued that the Chinese emphasis on social order led to a different perspective on interpersonal relationships. Moral actualization requires the actualization of the innate human nature of the moral self instead of other selves. Only a single self participates in moral actualization. Moreover, the Chinese cultural spirit only emphasises the role of the moral self. Lao did not consider the cognitive self and the aesthetic self as independent from the moral self, as all cognitive and aesthetic activities are merely instruments of moral actualization. Therefore, there is only one self in the Chinese cultural spirit, namely, the moral self.

b. Modernization of Chinese Culture

i. The Problem of Objectivity

Based on his philosophy of culture, Lao established his political philosophy and provided criticism of traditional Chinese culture. As discussed above, for Lao, the Chinese cultural spirit is dominated by the Confucian spirit, which only emphasises the actualization of the moral self as a single self, instead of multiple and equal selves. In Lao’s Liberty, Democracy and Cultural Creation (2001), he explained why democracy and modern natural science are absent from traditional Chinese culture.

There are two kinds of interpersonal relations according to Lao, namely, coordinative relations (bing li guan xi) and hierarchical relations (ceng ji guan xi). A coordinative relation is an equal relation among individuals, while a hierarchical relation is unequal (Lao 2001, 224). In order to deal with certain “public affairs,” individual members of the society gather under a particular relation. If the members gather in a coordinative relation, acknowledging each member as an equal individual, they may develop a democracy. However, if they gather in a hierarchical relation, they may have a monarchy or an aristocracy.

The dominance of the moral self leads to a hierarchical relation, for the moral self suppresses the cognitive and aesthetic selves. The authority of the moral self cannot be challenged by the cognitive or aesthetic selves. The cognitive self cannot question the legitimacy of virtue and rituals, as their legitimacy does not rely on reason or empirical evidence. The moral self is a single individual who does not seek anything outward, for it has already attained all innate virtues. The Confucian emphasis on the moral self leads to a hierarchical relation, which is the social and cultural order according to the principle of li.

The absence of coordinative relations in the Chinese cultural spirit implies the absence of democracy and modern natural science. Lao argued that the virtue-oriented Chinese cultural spirit does not acknowledge inter-subjectivity, as the moral self is a single self suppressing all other selves. Without a coordinative relation, equality is impossible. Scientific knowledge is objective knowledge, which is verifiable or falsifiable by empirical evidence. In other words, the authority of scientific knowledge does not depend on any individual but on the objective evidence to which everyone has equal access. Scientific knowledge assumes the concept of equality and coordinative relations. If a scientist S1 verifies that theory T is true with experiment E, other scientists like S2, S3, S4 … and Sn should be able to verify theory T with the same experiment E. There are multiple subjects who are cognitive selves having equal access to the same method and the same knowledge, regardless of their social status.

Social status and roles, however, are important in the Confucian concept of virtue completion or moral actualization. Confucius argued that a monarch should behave like a monarch, a minister should behave like a minister, a father should behave like a father, and a son should behave like a son. The hierarchical order determines the particular ways of virtue completion for different people. A son has no access to his father’s moral practices and vice versa. A minister acting as a virtuous monarch is vicious for having seized the power from the monarch. One cannot actualize his innate virtue if he refuses to act according to his social role.

Hierarchical relations emphasise authority. Education and entertainment are distributed to different people according to their social statuses. A peasant should not play royal music in his home as it is not proper. A student should not condemn his teacher’s teaching openly as it is impolite. The overemphasis on authority prevents the development of modern natural science, which constantly falsifies previous theories. Hierarchical relations also prevent the development of democracy due to the denial of equality.

In short, Lao criticized the Chinese cultural spirit for its overemphasis on the moral self, which suppresses the cognitive and aesthetic selves and denies the existence of a coordinative interpersonal relation. Lao assumed that the Chinese cultural spirit is dominated by the Confucian spirit, which aims to actualize the internal and innate virtues of ren. Although he did not provide a concrete solution to cure the “sickness of Chinese culture,” he argued that if Chinese culture is to be modernized, it should acknowledge the existence of multiple subjects and a coordinative relation and develop an independent cognitive self. Lao claimed that most contemporary Chinese scholars failed to diagnose the problem of Chinese culture. This led to two extreme perspectives on the modernization of Chinese culture: traditionalism and anti-traditionalism. Lao condemned both perspectives as one-sided misinterpretations of the Chinese cultural spirit, as explained below.

ii. The Problem of Traditionalism

Traditionalists are people who argue for the full restoration and preservation of traditional Chinese culture against Western cultural invasion. Traditionalists do not argue against Chinese modernization. Rather, traditionalists argue that there are precious values within traditional Chinese culture that must be preserved. Traditionalists do not reject Western thoughts. Instead, they argue that the introduction of Western thought must be based on the preservation of essential traditional Chinese values. Such a perspective is known as zhongti xiyong (“Chinese [thought for] fundamentals and Western [thought for] practical application”) (Lao 2003, 104). For example, the early 20th century Chinese reformer Liang Qichao (1873-1929) argued that Western constitutional monarchies are consistent with traditional Confucian ethics, while Mou Zongsan tried to reinterpret Confucianism with the help of Kantian ethics.

According to Lao, traditionalists generally follow a Hegelian model of culture, according to which “[internal] values determine the activities which are manifested as [external] systems” (Lao 2003, 104-105). In the Manifesto on the Reappraisal of Chinese Culture (1958), “new Confucian” philosophers—including the aforementioned Mou Zongsan and Tang Junyi—reconstructed traditional Chinese culture by articulating how Confucian values determine the Chinese cultural phenomenon. Tang even argued that conservative attitudes can be progressive. Progression must be based on value consciousness: how to manifest values in contemporary situations (Tang, 1974, 24). The overseas Chinese refugee must be confident with traditional Chinese culture and “re-root oneself spiritually” in order to preserve traditional Chinese values (Tang 1974, 49). However, Lao questioned Tang and his fellow “new Confucians” by asking why Confucian and traditional Chinese values are worthy of preservation and manifestation in the contemporary world. Lao argued that if traditional Chinese culture needs to be restored or preserved, as new Confucians argued, then traditional Chinese culture had already declined. So why did traditional Chinese culture decline? And why is traditional Chinese culture worthy of being preserved? Someone taking Tang’s position might reply by arguing in favour of the unique features of Chinese Confucian values, for example the xin xing lun, and blaming the Western invasion for the decline of traditional Chinese culture. As discussed before, however, Lao argued that Confucian overemphasis on moral self and the suppression of cognitive and aesthetic selves led to that cultural decline. Natural science and democracy failed to develop within traditional Chinese culture until the Western invasion arrived. For Lao, this amounted to a devastating critique of the traditionalist arguments advanced by Tang and his colleagues. 
Still, this did not mean that Lao embraced the anti-traditionalist view, either.

iii. The Problem of Anti-Traditionalism

As opposed to traditionalists, anti-traditionalists are people who reject the value of traditional Chinese culture entirely in order to achieve Chinese modernization. Hostility towards traditional Chinese culture was the mainstream ideology in early 20th century China. After the First Opium War (1839-1842), the Qing dynasty suffered from invasions and interventions by the West. Chinese people were aware of the weakness of traditional Chinese culture and tried to modernize China through Westernization. After the overthrow of the Qing dynasty and the founding of the Republic of China in 1912, the imperial system collapsed, which challenged the hierarchical order of traditional Chinese culture. The May Fourth New Culture Movement in 1919 blamed traditional Chinese culture for being a barrier to Chinese modernization. This movement assumed that modernization counteracts traditional Chinese culture.

To evaluate anti-traditionalists’ criticism of traditional Chinese culture, Lao reconstructed the anti-traditionalist argument as follows:

  1. Traditional Chinese culture is a barrier to Chinese modernization.
  2. Confucianism is an influential origin of traditional Chinese culture.
  3. To achieve modernization, one must oppose traditional Chinese culture and therefore reject Confucianism (Lao 2003, 59).

Furthermore, to demonstrate that traditional Chinese culture is a barrier to Chinese modernization, one must demonstrate the truth of two statements: that traditional Chinese culture is influential enough to prevent Chinese modernization and that all factors preventing Chinese modernization stem from traditional Chinese culture. Lao argued that neither statement can easily be substantiated. Traditional Chinese culture was very weak in modern Chinese society after the Opium Wars. The republic replaced the imperial monarchy in the 1912 revolution, and modern written Chinese replaced classical written Chinese in the 1919 May Fourth New Culture Movement. Both the Chinese Communist Party and the KMT, while bitterly opposed to one another, engaged in critiques of traditional Chinese culture (Lao 2003, 59-60).

Lao argued that, while there are counterexamples that refute the first statement, the second statement is difficult to justify. There are numerous factors preventing Chinese modernization that may come not from traditional Chinese culture, but from humans’ simple animal nature, namely, instincts like selfishness, fear, and so on. People’s fear of change may prevent social reform, and rulers’ ambitions of power may prevent democratic reform. Traditional Chinese culture may have nothing to do with these factors. Therefore, it is unfair to blame traditional Chinese culture for preventing Chinese modernization. Besides, while the Confucian spirit dominates the traditional Chinese cultural spirit, Confucianism is not the only influential origin of Chinese values. Daoism and Buddhism are more influential at the popular level of society than Confucianism. Even if traditional Chinese culture were the major barrier to Chinese modernization, it would be unfair to argue that Confucianism is also a barrier, as traditional Chinese culture is not identical with Confucianism (Lao 2003, 62).

Lao criticized anti-traditionalists for their failure to acknowledge modernization as imitation or learning, rather than destroying one’s own traditional cultural heritage. A native speaker of Cantonese cannot learn English as his second language until he can speak his native language. When he learns English as a second language, he does not give up Cantonese as his native language. Learning is the process of obtaining new abilities based on existing capacity. Without existing capacity, it is difficult to learn any new skill (Lao 2003, 73).

4. Criticisms and Influence

Lao’s writings make it clear that he tended to overemphasize the importance of Confucianism in characterizing Chinese culture at the expense of other traditions, especially Buddhism and Daoism. In Essentials of Chinese Culture, Lao expressed his bias against the Daoist religion. He argued that the Daoist religion has little impact on political systems and moral teachings, as the Daoist religion “is not believed by the scholars” (Lao 1998, 180).

Moreover, Lao assumed that Chinese scholars or philosophers—most of whom were social elites—determined the nature of the Chinese cultural spirit. Given that since the 11th century or so, most Chinese elites have embraced Confucianism, for Lao it seemed obvious that Confucianism defines the Chinese cultural spirit. However, as continental philosophers such as N. F. S. Grundtvig have pointed out, it is questionable whether a cultural spirit is defined by scholars alone. Lao seemed to assume that the working class and folk religion play only small roles in cultural development. However, if a cultural spirit bears the true meaning and values behind all cultural phenomena, one should observe how those cultural phenomena are manifested in reality and how community members interpret a particular cultural phenomenon.

Additionally, Lao’s definition of culture as a “cultural spirit” is problematic. If culture is defined as a cultural spirit, which is value consciousness, it means that all members share and manifest the same values in their cultural behavior, as long as they are conformists. However, members of the same culture can interpret the same cultural phenomenon with different values. An uneducated peasant may have no knowledge about the deep meanings behind the rituals of ancestor worship or a traditional Chinese marriage ceremony. He may merely follow the customs and habits without reflection. Alternatively, he may interpret ancestor worship in a very different way from the standard Confucian interpretation endorsed by Confucian scholars. While Confucian scholars interpret the ritual of ancestor worship as a way to show respect and commemorate ancestors, a peasant may practice ancestor worship as a way to ask ancestors to bless his family. Different subgroups, economic classes, or families within the same community may have vastly different interpretations of the same cultural phenomenon. Differences in value consciousness imply different cultural spirits. Thus, the definition of culture as cultural spirit or value consciousness is problematic, as it is very difficult to verify what values are essential to a particular culture.

Even granting that the Confucian spirit dominates the Chinese cultural spirit and that the latter emphasises virtue completion or moral actualization, it is nonetheless unclear why the dominance of the moral self in the Chinese cultural spirit implies a hierarchical relation among individuals. The moral self’s suppression of the aesthetic and cognitive selves does not signify the single self’s suppression of the other selves. A real individual self is a union of the aesthetic self, the cognitive self, and the moral self. The relation between the aesthetic, cognitive, and moral selves should be distinguished from the relation between individual selves. The former can be a suppression within a single individual self, but the latter is a suppression among different individual selves, namely, an individual who oppresses other individuals. Undoubtedly, Lao confused the real individual self with the moral self. More importantly, as Confucianism acknowledges that everyone has the innate moral capacity to achieve virtue completion, it should be able to acknowledge the equality of human beings. Why, then, did classical Confucianism not acknowledge such equality but instead develop a hierarchical relation among people?

Furthermore, when it comes to the role of Confucian values in traditional Chinese culture, Lao seems to contradict himself. On one hand, Lao himself realized that Confucian values have a limited influence on traditional Chinese culture when he distinguished the opposition against traditional Chinese culture from the opposition against Confucianism in his discussion of anti-traditionalism. On the other hand, Lao maintained that the Confucian cultural spirit dominates traditional Chinese culture, assuming that Chinese scholars determined the structure of traditional Chinese culture while Daoism and Buddhism were less influential.

Finally, despite having been influenced by Western thought, Lao’s philosophy of culture contained a certain bias against Western culture, which weakened his discussions of transcultural dialogue between Chinese culture and Western cultures. Lao displayed no interest whatsoever in Christian contributions to such transcultural dialogue, nor did he acknowledge Christianity’s influence on the modernization of Chinese culture. Considering that Lao spent many years teaching at Chung Chi College, a Protestant Christian institution in which discussions of dialogue between Christianity and Confucianism were frequent and enthusiastic, it is surprising that he had so little to say about Christianity. He devoted only a few pages of Essentials of Chinese Culture to summarizing the history of Christianity in China. Without offering any evidence, Lao argued that Christianity “has yet to infiltrate the cultural life of the Chinese nation” and that Chinese people have little passion for the Christian faith (Lao 1998, 191).

5. References and Further Reading

a. Primary Sources

  • Kang de zhi shi lun yao yi [Essential of Kant’s Theory of Knowledge]. Hong Kong: Union Press, 1974.
  • Li shi zhi cheng fa [The Punishment of History]. Hong Kong: University Life Ltd., 1971.
  • Zhongguo zhi lu xiang [China’s Way Out]. Hong Kong: Wisdom Publishing, 1981.
  • Xin bian zhong guo zhe xue shi [New Edition of the History of Chinese Philosophy]. Taipei: San Min Book Co. Ltd, 1984-1986.
  • “On Understanding Chinese Philosophy: An Inquiry and a Proposal,” in Understanding the Chinese Mind: The Philosophical Roots, ed. Robert E. Allinson (Hong Kong: Oxford University Press, 1989), 265-293.
  • Zhongguo wenhua yao yi xin bian [Essentials of Chinese Culture]. Hong Kong: The Chinese University Press, 1998.
  • Wen hua wen ti lun ji xin bian [Collection of Essays on Cultural Problems]. Hong Kong: The Chinese University Press, 2000.
  • Wen hua zhe xue yan jiang lu [Lectures on Philosophy of Culture]. Hong Kong: The Chinese University Press, 2003.
  • Xu jing yu xi wang: lun dang dai zhe xue yu wen hua [Illusion and Hope: On Contemporary Philosophy and Culture]. Hong Kong: The Chinese University Press, 2003.
  • Dang dai xi fang si xiang de kun ju [The Dilemma of the Contemporary Western Thoughts]. Taipei: Commercial Press Taiwan, 2014.

b. Secondary Sources

  • Mou Zongsan, Xu Fuguan, Zhang Junmai, Tang Junyi, and Xie Youwei. “Manifesto on Behalf of Chinese Culture Respectfully Announced to the People of the World: Our Joint Understanding of Sinological Study and Chinese Culture with Respect to the Future Prospects of World Culture,” trans. Eirik Lang Harris. Hackett Publishing, 2018.
  • Shen, Vincent. “Obituary of Lao Sze Kwang.” Journal of Chinese Philosophy 40/1 (2013): 215-217.
  • Tam, Andrew Ka Pok. A Discourse on Hong Kong Culture. Hong Kong: Passion Times, 2016.
  • Tang Junyi. Shuo zhong hua min zu zhi hua guo piao ling [On the Falling Flower and the Withering Fruit of the Chinese Nation]. Taipei: San Min Book Co. Ltd, 1974.

 

Author Information

Andrew Ka Pok Tam
Email: k.tam.1@research.gla.ac.uk
University of Glasgow
United Kingdom

Edward Jonathan Lowe (1950-2014)

Edward Jonathan Lowe (usually cited as E. J. Lowe) was one of the most significant philosophers of the twentieth and early twenty-first century. He made sustained and significant contributions to debates in metaphysics, ontology, philosophy of mind, philosophy of language, philosophical logic, and philosophy of religion, as well as contributing important scholarly work in early modern philosophy (most notably on Locke).

Over the length of his career, Lowe published eleven single-authored books, four co-edited collections, and well over 300 papers and book reviews in journals and edited volumes. The range of topics covered in his published work is highly eclectic. Given this, and his prolific rate of publication, this article cannot aim to cover all of the questions that Lowe contributed work on. Instead, it will focus on some of his most significant contributions in metaphysics and ontology, and related topics in other areas of philosophy.

This choice of focus stems, in part, from Lowe’s strong belief in the inescapability of metaphysical questions. Lowe argued for the need to approach metaphysics, and philosophy more broadly, in a serious, systematic fashion, likening metaphysics to putting together the pieces of a gigantic jigsaw puzzle, working with, rather than trying to overrule or being secondary to, natural science.

Although the sections in this article focus on different topics, the highly systematic nature of Lowe’s work means that there are many potential points of intersection that could be drawn between them. In the interests of providing a navigable summary of Lowe’s work, this article highlights only some of these connections.

Table of Contents

  1. Biography
  2. What Is Metaphysics?
    1. The Science of the Possible
    2. The Science of Essence
    3. Metaphysics, and Logic and Language
    4. Metaphysics and Common Sense
  3. Ontology
    1. The Four-Category Ontology
    2. Objects
    3. Properties
    4. Universals
    5. Kinds
    6. Further Formal Ontological Relations
      1. Exemplification
      2. Identity
      3. Composition
      4. Constitution
    7. Persistence and Change
      1. Endurantism vs Perdurantism
      2. Persistence and Intrinsic Change
  4. Essence
    1. What Are Essences?
    2. Modality and Essence
    3. Categoricalism
  5. Mind, Persons, and Agency
    1. The Non-Identity of Mental and Physical States
    2. Non-Cartesian Substance Dualism (NCSD)
    3. The Unity Argument for NCSD
    4. Mental Causation
    5. Agent Causation
  6. Other Work
  7. References and Further Reading
    1. E. J. Lowe
    2. Other References

1. Biography

Lowe was born in Dover, England, on 24 March 1950. He went to Cambridge to study Natural Sciences in 1968, changed to History after one year, and was awarded a BA (first class) in 1971. Lowe then switched to studying philosophy and moved to Oxford. He was awarded his BPhil and DPhil degrees in 1974 and 1975 (supervised by Rom Harré and Simon Blackburn respectively). After briefly teaching at the University of Reading, Lowe moved to the University of Durham in 1980, where he would stay for the rest of his career until his death in 2014.

2. What Is Metaphysics?

In the preface to The Possibility of Metaphysics, Lowe states that his ‘overall objective in this book is to help to restore metaphysics to a central position in philosophy as the most fundamental form of rational inquiry, with its own distinctive methods and criteria of validation’ (1998: iii). This section outlines Lowe’s view on what metaphysics is, how it relates to other areas of research and inquiry, and why metaphysics is, for Lowe, ‘unavoidable’. Understanding the inevitability of metaphysical inquiry, and the relationship of metaphysical research to other areas (including physics and the other natural sciences, but also ‘common sense’ and ordinary perception), is crucial to understanding Lowe’s motivation for defending various first-order metaphysical positions. As such, whilst important in its own right, the significance of Lowe’s views about these metametaphysical issues may only become clear later in this entry, once we begin to grapple with the first-order issues.

a. The Science of the Possible

For Lowe, metaphysics has dual characterisations: as the science of the possible and the science of essence.

Regarding metaphysics as the science of the possible, Lowe does ‘not claim that metaphysics on its own can, in general, tell us what there is. Rather—to a first approximation—I hold that metaphysics by itself only tells us what there could be’ (1998: 9; see also 2006a: 4–5, 2011a: 106; 2007b; 2008a; 2008b; Ms.). Metaphysics is, in part, the process of charting the domain of objective or real possibility, which, Lowe holds, is ‘an indispensable prerequisite for the acquisition of any empirical knowledge of actuality’ (2011a: 100). That is, in metaphysics and ontology we explore how things might be: what is possible and compossible (what things could co-exist). This enquiry into the possible ways reality might be, in conjunction with empirical work, can allow us to get at what is actually the case, for we must, on Lowe’s view, understand what is possible before we can understand what is actual. In this way, metaphysics becomes indispensable as a way to illuminate the features of reality that empirical scientific enquiry presupposes, but it must be combined with that empirical enquiry to arrive at a full account of how reality is.

This claim about the science of the possible also leads Lowe to a position about the methods of metaphysics, holding that metaphysics’ method ‘is first to argue, in an a priori fashion, for the possibility—and compossibility—of certain sorts of things and then to argue, on partly empirical grounds, for the actuality of some of those things that are compossible’ (2011a: 105). Metaphysics is a holistic enterprise, not to be done in a piecemeal way, as the attempt to understand what things exist and, just as crucially, co-exist.

Lowe’s conception of metaphysics is not divorced from experience and empirical data. There is no clear boundary for Lowe between the work of the metaphysicians and that of the theoretical sciences. But this is not to say that there is not a distinctive role for the philosopher. For Lowe, ‘science presupposes metaphysics… Empirical science at most tells us what is the case, not what must or may be (but happens not to be) the case. Metaphysics deals in possibilities’ (1998: 5).

Lowe’s view holds that metaphysics, or more precisely ontology, comes in two parts: ‘one which is wholly a priori and another which admits empirical elements’ (2006a: 4). The a priori part of ontological theorising is best taken to be that part of metaphysics that is the ‘science of the possible’ described above. That is, the a priori part of ontology explores the realm of genuine metaphysical possibility, and what things could co-exist in a single possible world.

Note that the use of ‘possible world’ here is not intended to invoke a commitment to the concrete reality of possible worlds. Lowe rejects Lewis’ modal realism, denying that possible worlds, whether they exist or not, are objects (1988: 256). Rather, ‘possible world’ here is only used as a phrase to highlight that we can produce a number of theories that seek to describe how reality is, and call each of them a possible way that reality could be. The a priori part of ontology is thus devoted to exploring those possible ways that reality might be.

The ‘empirically conditioned’ part of ontology seeks ‘to establish, on the basis of empirical evidence and informed by our most successful scientific theories, what kinds of things do exist in this, the actual world’ (2006a: 4–5). Given that metaphysics is, in part, the science of the possible, we can see that for Lowe metaphysics differs in both its subject matter and methodology from the empirical sciences, but crucially the two exist ‘in a symbiotic relationship, in which each complements the other’ (2011a: 102; see Morganti and Tahko 2017).

Because one aspect of ontology is (predominantly) a priori, ontology is methodologically distinct from the empirical sciences. Because its subject matter is the genuinely possible ways reality could be, it is also distinct in subject matter, as empirical science does not concern itself with how reality could be, only with how it is. But crucially, as ontology has two aspects, and two tasks, it overlaps in one of those tasks with the empirical sciences. This is what gives rise to a truly symbiotic relationship, avoiding many of the issues that arise in other accounts that seek to give priority (epistemic, or otherwise) to either the empirical sciences or to metaphysics.

However, it also brings into focus why, for Lowe, no science can provide the map of reality. Natural sciences are focused on restricted domains, and on what is actual, but grasping what is actual requires us first to know what is possible (2006a: 4). Metaphysics is unavoidable, essential, and cannot be rejected (despite the various arguments that have attempted to do so). For Lowe, metaphysics provides the foundation for natural science, and without that grasp on what is possible, we cannot have knowledge of what is actual, nor come to recognise the implicit (or explicit) assumptions within natural science (see Mumford and Tugby 2013).

b. The Science of Essence

In parallel with the above, as the science of essence, Lowe takes metaphysics to be the task of saying what some entity is such that it is that entity—to provide the real definition for that entity (as opposed to the verbal definition; see Fine 1994). To enquire into the real definition of an entity is to attempt ‘to characterise, as perspicuously as possible, the nature or essence of some actual or possible being’ (2007a). Lowe takes this interest in real definition and ‘essence’ from Aristotle. For example, a characterisation of the essence of a circle—‘a perspicuous way of saying what it is, or would be, for something to be a circle’ (2007a)—is to be the locus of a point moving continuously at a fixed distance around another point. This is what it is, or would be, for something to be a circle.

This focus on essence has meant that Lowe is commonly listed as a key figure in the recent resurgence in ‘neo-Aristotelean’ approaches to metaphysics, taking metaphysics not to be primarily concerned with what exists (as in the neo-Quinean tradition), but rather with the essence of those types of entities that do exist, and the metaphysical relations that hold between them.

This conception of neo-Aristoteleanism should be distinguished from another conception, as discussed by Schaffer (2009). Under this alternative conception, neo-Aristoteleans need not accept essences into their ontology, but they do share the focus on how entities are related to each other, rejecting the neo-Quinean focus on what exists (for more on the neo-Aristoteleanism that Lowe endorsed, see Lowe 2013c, Novotný and Novák 2014, Tahko 2012).

We comment more specifically on Lowe’s notion of essence in a later section. However, it is important to see the links that exist for Lowe between metaphysics as the science of the possible, and metaphysics as the science of essence. To elucidate the essential nature of an entity is to provide the existence and identity conditions for that entity, or that kind of entity. The essence is what dictates what that entity is, or would be.

Note the ‘or would be’ in this account. Lowe is clear throughout his work that he is investigating what entities or kinds of entities would be like independently of whether any of them do actually exist. This is not to say that Lowe thinks that he is engaging in some conceptual analysis around the notion of, say, a circle. Rather it is to say that given that metaphysics is about what is possible, we must understand what it would be for something to be a circle so that we can then consider whether reality does in fact contain anything that fits that real definition. This again shows the connection between metaphysics as the science of the possible and metaphysics as the science of essence.

c. Metaphysics, and Logic and Language

Another domain that is important to highlight in understanding the role and importance of metaphysics is that of language and formal logic. This is particularly the case given the central role in much of the analytic philosophy tradition given to first-order predicate logic with identity. Lowe is clear in his rejection, not of such logic per se, but of its assumed dominance, and of the types of ontological claims and distinctions that arise from this logical system. For example, Lowe rejects what Smith called ‘Fantology’, the view that the ‘key to the ontological structure of reality is captured syntactically in the “Fa” […] of first-order logic, where “F” stands for what is general in reality and “a” for what is individual’ (Smith 2005: 1; see also Smith 1997; Lowe 2013a: chapter 4).

The central problem with Fantology, for Lowe, is that it equips us with ‘a certain conception of reference and predication which is, from the point of view of serious ontology, extremely thin and superficial’ (2013a: 50). First-order predicate logic with identity provides a restricted formal machinery that allows only for ontological distinctions between objects and properties, and between existence and identity. These distinctions are most certainly present in Lowe’s ontology; however, there are many more in addition to these two.

A further problem with Fantology comes from its adherents holding to Quine’s maxim that ‘to be is to be the value of a variable’. Lowe holds that ‘∃’ should be analysed as the ‘particular quantifier’ rather than as an existential quantifier. By so doing, the particular quantifier can quantify over non-existent objects, without having to accept Meinong-like distinctions. For expressing existence, Lowe prefers the use of a monadic existence predicate, ‘E!’. This logical machinery, he argues, better suits the ontological framework that he defends, and thus is to be preferred (see 2013: chapter 4).

This brings us to the main point for Lowe with respect to logic, and language. Understanding language and logic is important, and he does on occasion use arguments from natural language in particular to highlight ontological distinctions (for example between categorical and dispositional predication; see 2013a: chapter 5). But language, and logic, mislead. It is central to Lowe’s philosophical theorising that the hard work of ‘serious ontology’ must come first, and that ontological conclusions cannot be read off of our language or our logic. This is what motivated Lowe’s adoption of a version of sortal logic instead of first-order predicate logic. It is not that sortal logic is intrinsically ‘better’. It is that a version of sortal logic allows Lowe to express the ontological distinctions that he believes exist, which cannot be expressed perspicuously with the tools of first-order predicate logic (2006: chapter 4).

Therefore, in what follows when commenting on Lowe’s first-order views, it should be stressed that arguments that might initially seem to move from grammatical or semantic points to ontological conclusions are not of the form: language expresses facts in this way; therefore, we should adopt the corresponding ontology. Instead, the move always has to be from ontology to a correct language or logical system.

This does not rule out that some distinctions appear in our natural language, in part, due to those distinctions being indicative of corresponding distinctions in reality (Lowe, personal communication). For example, the grammatical distinction between subject and object might exist in our language because there is a relevantly similar distinction in reality between objects and properties. This is not, though, to read the distinction off our language; the case for an ontological distinction between objects and properties must stem from ontological rather than linguistic arguments.

d. Metaphysics and Common Sense

As a last point on the more meta- or methodological parts of Lowe’s work, it is important to note the commitment to common sense in Lowe’s metaphysics. Common sense, for Lowe, is the starting point for many metaphysical and philosophical problems. All else being equal, Lowe often appeals to solutions that are the ‘least revisionary’ either with respect to how we perceive the world, or how we typically talk about the world. Coherence with common sense should be retained if possible, and only rejected if ‘moving away’ from common sense yields significant theoretical advantage. Metaphysics will not always follow common sense, but it can be our starting point, when combined with a respect for science that resists scientism. This sensitivity to common sense becomes further apparent in various places throughout the rest of this article.

For example: on the tensed view of time, ‘for what it is worth, I consider it to be a distinct merit of the tensed view of time that it delivers this verdict, for it surely coincides with the verdict of common sense’ (1998: 104); on intrinsic change, ‘it seems to me that if we have to accept one or other of these three solutions to the semantic problem of intrinsic change, then we had better opt for solution (ii), as this is clearly the least revisionary with respect to our common-sense talk of persistence through change’ (1998: 130); on predicates and properties, ‘the idea behind the proposal is the seemingly common-sense one that the property of being F is what all and only the Fs have in common’ (2006: 122); on four dimensionalism about objects, ‘I have grave doubts about the ultimate coherence of this view of things, suspecting that what superficial plausibility it possesses is parasitic upon our prior grasp of the very neo-Aristotelian or “common-sense” conception which it seeks to challenge’ (2009: 18); and on Quine’s ontological relativism, ‘it is not one that should be contemplated as long as the prevailing common-sense ontological scheme can be defended as viable, as I believe it can’ (2009: 90).

This acceptance of a role in our metaphysics for common sense is not to deny Lowe’s view that metaphysics should be approached as the study of the fundamental nature of reality in a serious and steadfastly realist way. Rather, it is to say that for Lowe it is not the case that metaphysicians have some infallible insight on eternal truths, insulated from the human perspective that otherwise might distort our claims. Metaphysics seeks to understand the nature of reality, whilst accepting that any claims about reality will be made from a particular perspective. We have a relation to reality, but ‘that we cannot stand outside ourselves to study that relation need not imply that it cannot be studied by us at all’ (1998: 4).

This last point serves as the basis on which Lowe rejects what he calls the neo-Kantian objection to metaphysics (2001: 4). Lowe argues that we are ourselves part of reality, and so are our thoughts. This means that claims that knowledge of how things really are is impossible founder on a contradiction. Metaphysics, and metaphysicians, must be ‘critical’ (2001: 5). Metaphysics may involve refining concepts, but this is to make those concepts more reflective of reality. That we have a particular viewpoint on the world does not stop this from being possible; rather, it just means that we must be careful to ensure that we are suitably critical.

3. Ontology

a. The Four-Category Ontology

At the heart of Lowe’s metaphysical (and much of his broader philosophical) work is his defence of a four-category ontology. This was developed over a long period of time, with its most extensive exposition in the 2006 book named for it. This ontology, Lowe argues, best allows for a balance between explanatory power and ontological parsimony, and, along with the equally central notion of ‘essence’, provides the basis for a unified account of a wide range of phenomena (as becomes clear in the remainder of this article).

(Note that in earlier work (1989) Lowe defended a three-category ontology but came to believe that an additional category was needed, and theoretically justifiable despite the additional ontological cost. Later, Lowe argues that ‘persons’ may be a further (fifth) fundamental category of entities (2008a). The status of persons is discussed in section 5.)

The four-category ontology explicitly takes its inspiration from the early work of Aristotle, most centrally in the Categories. As Lowe interprets Aristotle:

[Aristotle] articulates a fourfold ontological scheme in terms of the two technical notions of ‘being said of a subject’ and ‘being in a subject’. Primary substances […] are described as being neither said of a subject nor in a subject. Secondary substances—the species and genera to which primary substances belong—are described as being said of a subject but not in a subject. That leaves two other classes of items: those that are both said of a subject and in a subject, and those that are not said of a subject but are in a subject. Since these two classes receive no official names and have been variously denominated over the centuries, I propose to call them, respectively, attributes and modes. (2012a: 97)

Put into Lowe’s preferred terminology, the four-category ontology thus emerges from the intersection of two exhaustive and exclusive ontological distinctions: the first between entities that are substantial and non-substantial (that is, properties or relations), and the second between entities that are universal and particular. This leads to the view that all entities (actual and possible) are assigned to one of the following ontological categories: object (substantial particular), kind (substantial universal), attribute (non-substantial universal), mode (non-substantial particular).
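The way the four categories fall out of the two distinctions can be made concrete with a small sketch. The code below is only an illustration under our own naming assumptions (`Category` and `categorise` are hypothetical names, not anything found in Lowe’s work); it shows that each combination of the two binary distinctions yields exactly one category.

```python
from enum import Enum


class Category(Enum):
    OBJECT = "object"        # substantial particular
    KIND = "kind"            # substantial universal
    ATTRIBUTE = "attribute"  # non-substantial universal
    MODE = "mode"            # non-substantial particular


def categorise(substantial: bool, universal: bool) -> Category:
    """Map the two exhaustive and exclusive distinctions onto a category."""
    if substantial:
        return Category.KIND if universal else Category.OBJECT
    return Category.ATTRIBUTE if universal else Category.MODE


# Enumerate all four combinations of the two distinctions.
for substantial in (True, False):
    for universal in (True, False):
        label = categorise(substantial, universal).value
        print(f"substantial={substantial}, universal={universal}: {label}")
```

Because the function is total over both inputs and never returns the same category twice, it mirrors the claim that the two distinctions are jointly exhaustive and mutually exclusive.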

A note on this terminology: Lowe preferred the terminology of ‘mode’ as he took inspiration (and the term) from Locke for his category of particular non-substantial entities. These are property- or relation-instances, and are elsewhere, including by Lowe, called ‘tropes’, ‘abstract particulars’, or ‘individual accidents’.

All four of these categories are equally basic or fundamental. Terms such as ‘universal’, ‘particular’, and ‘entity’—the all-encompassing category that all entities, both universal and particular, belong to—are taken to be transcategorial as they apply to entities from multiple categories.

The four fundamental categories are related to each other through patterns of instantiation and characterisation relations. A particular object is an instance of a kind—a particular tiger is an instance of the kind tiger; and a particular mode is an instance of the non-substantial universal or attribute—the particular redness, say of a particular ball, is an instance of the non-substantial universal redness. The instantiation relation thus tracks the distinction between universal and particular.

The characterisation relation holds along the other dimension of the four-category ontology, between the substantial and non-substantial. A particular redness characterises the particular substance whose redness it is, and the non-substantial universal (attribute) redness characterises the substantial universal (kind) tomato.

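Taken together, these categories and relations can be summarised in what Lowe called the ‘Ontological Square’ (2006a: 18), a diagram with kinds and attributes at its upper corners and objects and modes at its lower corners, in which instantiation runs vertically between universals and particulars and characterisation runs horizontally between the substantial and the non-substantial.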

It should be stressed that these relations are not further elements of being. That is, the ‘relations’ of instantiation and characterisation are strictly ‘formal’. This is connected to, but strictly not the same as, the form/content distinction in logic: ‘The ontological form of an entity is provided by its place in the system of categories, for it is in virtue of a being’s category that it is suited or unsuited to combine in various ways with other beings of the same or different categories’ (2006a: 48). Instantiation and characterisation (and other relations discussed below) are thus not relational properties; they are formal relations that illustrate how those entities they relate are. This means that formal relations metaphysically explain the nature of those entities they relate, without those relations themselves being further things (compare the notion of internal relation drawn from Moore 1919).

That these formal relations are not further elements of being also has an additional benefit for Lowe’s system in that it avoids the possible threat of Bradley’s Regress. Bradley’s Regress arises when we consider what explains the claim that objects and properties (or bundles of properties) are related (see Bradley 1893). If we conceive of such relations as distinct from their relata, then we would need to posit further relations to relate them to the original relata, and so on ad infinitum. The formal nature of instantiation and characterisation for Lowe ensures that this problem does not arise. The formal ontological relations are not distinct from the relata they relate, and hold purely in virtue of the existence and intrinsic nature of the relata.

Similarly, the categories themselves are formal and are not further elements of being. Rather the categories indicate the ontological form of the entities that fall under that category, and how those entities that fall under distinct categories are related to each other. The ontological categories therefore do not themselves exist.

The main reason for the non-existence of the categories themselves is that all entities that do exist must fall under one of these categories, but the categories themselves cannot be so analysed. In brief, the categories cannot be universals, as universals have particulars as their instances; if the categories were universals, then they would have to have universals (such as the kind dog) as instances. One way out of this is to posit the category of kinds as being a particular, say a set (an abstract particular object). However, this immediately raises the problem of requiring that this set, the set of the categories, is a member of itself. For Lowe, this problem is sufficient to reject this possibility.

One further possibility is to take the categories to be ‘higher-order’ universals, and therefore to have the first-order categories as their instances. However, and leaving aside Lowe’s general reluctance to accept higher-order universals, the higher-order universals that the different categories would belong to would have to be different under Lowe’s system: the category of kinds would be a second-order universal as its instances are other kinds that are themselves universals; whilst the category of objects would be a first-order universal as its instances are particular substances. Given that the categories would not in fact be of the same order, they would not actually be of the same kind, and therefore categories cannot be higher-order universals (2006a: section 3.3; see also Griffith 2015, Miller 2016).

To be clear, this is still a realist account of ontological categories, despite the categories not themselves being elements of being:

An object is different from a property or a mode in virtue of the intrinsic natures of these entities, quite independently of us and our ways of describing or thinking of things. We place things in different ontological categories correctly if we distinguish them rightly in respect of these intrinsic and objective differences between them. (2006a: 43–44)

As discussed in more detail below, we categorise correctly if we correctly account for the existence and identity conditions of an entity—or the essence of that entity—which will be in line with which of the categories the entity falls under.

For example, it is part of the essence of a mode that it depends for its existence and identity upon the object that it characterises and that it is an instance of an attribute. All modes are intrinsically different from all entities that fall under other categories due to these mind-independent existence and identity conditions.

The categories and the relations that hold between them create various forms of asymmetrical dependence relations. Particular modes are ontologically dependent on the particular substance that they are a mode of. Indeed, for Lowe, a mode can only be the mode that it is if it is a way that that particular substance is. For example, the particular mode of redness of a particular apple cannot characterise any other particular object. However, that particular object (the apple) could have been characterised by a different mode. Therefore, the particular object is only weakly dependent on the modes that characterise it, whilst the modes are strongly dependent on the particular objects that they characterise.

Similar asymmetrical dependency relations hold between the other categories. Some non-substantial universal (say, the attribute redness) is weakly dependent on the particular modes that are instances of it. The attribute redness would still exist if all of the actually existing redness modes did not exist, just so long as at least one redness mode did exist. The same is not the case for a particular mode, and so the mode is strongly dependent on the existence of the attribute that the mode is an instance of.

The role of dependency relations within this ontological system is important. However, Lowe holds that dependence is not so much a single relation as a family of relations, including, at least, rigid existential dependence, non-rigid existential dependence and also identity dependence. Dependence, though genuine, is not fundamental, but rather is ‘founded’ upon other formal ontological relations that are more ontologically basic (see Lowe 1994; Tahko and Lowe 2015).

It should be stressed that the non-fundamental nature of dependency should not lead us to think it is unimportant. The variety of dependency relations, and the ‘founded’ nature of dependence, allows for a wide range of intricately distinct dependence relations of differing modal and metaphysical strengths. This feature becomes clear throughout the whole of this article by discussing a range of formal ontological relations, and the key role that they play in differentiating entities and categories of entities.

Beyond the fundamental categories, Lowe argues that a complete metaphysical picture of the world will contain further categories, which are interrelated in a hierarchical structure. This allows Lowe to say that there are both more general but non-fundamental categories (‘substantial’, ‘non-substantial’, ‘entity’, ‘universal’, and ‘particular’), and less general non-fundamental categories (such as ‘concrete objects’ and ‘events’). This acceptance of a hierarchy of entities means that Lowe is not committed to the claim that there only exist fundamental entities. Instead, clearly non-fundamental entities (such as money, or a dog) can exist, and the task of the metaphysician, in line with neo-Aristotelian claims, is to map how such entities are related to each other. It is the case that all of the fundamental categories are occupied, but ‘there is plenty of scope to debate whether or not various subcategories of those basic ones are filled in actuality’ (2006a: 44). Lowe provides an illustrative example of the hierarchical system he ‘favours himself’, but does note that it is a ‘partial sketch’ (2006: 8).

As noted in passing in the initial description, it is important to stress that Lowe sees this categorial scheme as applying to both actual and possible entities, and suggests that it is the role of the empirical sciences, not philosophers, to decide what entities are actual:

Metaphysics should not be in the business of dictating to empirical scientists precisely how they should categorise the theoretical entities whose existence they postulate. Metaphysics supplies the categories, but how best to apply them in the construction of specific scientific theories is a matter best left to the theorists themselves, provided that they respect the constraints which the categorial framework imposes. (2006a: 19)

b. Objects

Lowe defends the idea that particular objects cannot be mere bundles of properties (either of tropes or of universals), nor should they be thought of as some mixture of a ‘mysterious substratum’ and properties (2006a: 28, see also 2000b). The view here is that objects are an irreducible and basic category of entity, which, as part of their essential nature, perform a ‘supporting-role’ for particular property-instances. Thus:

According to my conception of objects, an object is not a complex which is somehow constituted by a collection of particular properties together with some further entity which is itself neither a particular property nor a propertied object. The mistake is to suppose that an object is even partially constituted by its particular properties, as this inverts the true direction of ontological dependency between object and property. Particular properties are no more (and no less) than features or aspects of particular objects, which may indeed be selectively attended to through a mental process of abstraction when we perceive or think of particular objects, but which have no being independently of those objects and which consequently cannot in any sense be regarded as ‘constituents’ of objects. In this respect, the particular properties of an object differ radically from its parts, if it has any, for these are just further objects with particular properties of their own. (2006a: 97)

Justification for this view—for the additional posit in our fundamental ontology of particular substances or objects—comes largely from what Lowe sees as flaws or confusions in the competing views.

On the bundle theory, Lowe thinks that the problem is that if we take that view, we cannot provide adequate identity-conditions for property-instances: ‘Property-instances are ontologically dependent entities, depending for their existence and identity upon the individual substances which they characterize, or to which they “belong”’ (2006a: 27). They cannot ‘float free’ from an individual substance, as properties are ‘ways that objects are’.

On ‘substratum’ views, we are in danger of being committed to ‘bare particularism’, ‘or to the notion of a property-less “substratum” that somehow “supports” and “unites” the properties of a single object’ (2006a: 27). The mistake here is to think of individual substances as complex entities that are composed of a non-propertied substratum and some properties. Rather, Lowe thinks that particular substances are simple fundamental entities that are weakly dependent on property-instances, in that all particular substances are some way, and so must be characterised by at least one property-instance. But this does not make them complex, nor does it make the individual substance’s properties items that compose that individual substance. Nor does this mean that objects cannot be composite. Some objects, living organisms for instance, may be made up of lesser substantial parts.

Together, these motivations lead Lowe to hold that we have reached ‘explanatory bedrock’ in the concept of ‘substance’ or ‘object’, and thus that we should accept the category of ‘individual substance’ into our ontology.

c. Properties

Some further comment is required on Lowe’s views about properties, particularly given his commitment to the existence of both universal and particular properties.

Properties are ways of being, or ways that objects are (2006: 90–91). The particular property of ‘redness’ thus is a way that some object is, and the universal property of redness is a way that more than one object is, such that those objects can be said to be the same colour. This means that properties are not objects as, in line with the above, they cannot exist independently. Properties are strongly dependent on objects, but objects only weakly so on properties.

Relational properties are non-formal and are also taken to be ways that objects are, but ways that two or more objects are such that they are related. As such, relational properties are further elements of reality, and do not hold purely in virtue of the nature of those entities they relate. For example, if ‘loving’ is a genuine existing relation such that ‘John loves Mary’ is true, then it is a non-formal relational property. The ‘loves’ relation tells us something about the way that John and Mary are.

Lowe is an ‘immanent’ realist about universals. This is because Lowe thinks that entities that do not exist ‘in’ space and time, such as transcendent universals, are causally inert and therefore cannot play the role in perception and causation that properties of objects are required to play (2006a: 98). However, Lowe rejects the view that an immanent universal is ‘wholly present’ in all of that universal’s instances, due to the view being committed to an ‘inexplicable mystery which borders on incoherence’, in having to hold that the same universal could be wholly present in two places at the same time.

Instead, Lowe supports a ‘weak’ doctrine of immanence which ‘just amounts to an insistence upon the instantiation principle—the principle that every existing universal is instantiated. Applied to a universal such as the property of being red, it implies that this universal must have particular instances which exist “in” space and time, but it doesn’t imply that the universal itself must literally exist “in” space and time’ (2006a: 99). This solution, though, requires a commitment to both the existence of (instantiated) universals and modes. We have already seen that this is something that Lowe is willing to endorse, but again it is of note that the holistic and systematic nature of Lowe’s ontological theorising is part of the reasoning that gives rise to these commitments.

It should additionally be noted that, in line with the comments above about Lowe’s views on the relationship between language and metaphysics, Lowe favours a somewhat sparse conception of properties, at least in the sense that he does not think that every meaningful predicate refers to a real property (2006a: 122). In fact, Lowe generally is of the view that far fewer than all meaningful predicates express real properties; however, the job of the philosopher is not primarily to decide which predicates are the ones that express real properties. That, rather, is left to the more empirically informed aspect of our research, just so long as the overall ontological framework is taken into account when considering each case.

d. Universals

A further reason for positing non-substantial universals comes from the commitment to kinds, or substantial universals. If there are kinds, then they cannot be characterised by particular property-instances, but must instead be characterised by universal properties. Given that Lowe defends the existence of kinds (more on this in a moment), he is thereby committed to universal properties that characterise them. A particular property, or mode, characterises a particular substance, not a kind of object. Universal properties can only characterise universal substances, and particular properties can only characterise particular substances.

Perhaps the main reason that Lowe endorses the existence of universals comes from concerns about laws of nature, arguing that to account fully for such laws we must posit both substantial and non-substantial universals.

Lowe criticises one common universal-invoking account of laws: that natural laws are relations between universal properties as a second-order relation of necessitation (see Armstrong 1983). Under such views, the form of a law is ‘F-ness necessitates G-ness’ and this entails the constant conjunction amongst particulars that ‘For any x, if x is F, then x is G’. However, Lowe argues that laws do not in fact entail constant conjunctions amongst particulars, because ‘laws—apart, perhaps, certain fundamental physical laws—admit of exceptions, which arise from the possibility of interfering factors in the course of nature’ (2006a: 29).

Lowe argues that we should think of laws of nature as determining ‘tendencies’ in the particular objects that they apply to, which result from the complex interaction of multiple laws. This means that, following from the Ontological Square above, laws consist, in the simplest cases, of kinds being characterised by some non-substantial universal or property, or of two or more kinds being characterised by a relational universal. There is no need to invoke second-order necessitation relations, and we can more directly read the correct form of a law from our everyday talk: ‘The basic form of a law is not “F-ness necessitates G-ness”, but “Ks are F”, or “Ks are R-related to Js”, where “K” and “J” denote substantial universals, “F” denotes a property and “R” denotes a relation—that is, where “F” and “R” denote non-substantial universals’ (2006a: 30).

For example, if it is a law that ‘rubber stretches’, this is to say that the kind ‘rubber’ is characterised by the non-substantial universal of ‘stretchiness’; or if it is a law that ‘Protons and electrons attract each other’, this is to say that the kind ‘proton’ and the kind ‘electron’ are characterised by the ‘attraction’ relation.

This account, additionally, has the benefit of distinguishing logically between statements of laws, and the corresponding generalisations—between ‘Violets are blue’ and ‘All (particular) violets are blue’ (2006a: 94). One is a statement of law; the other is a statement about all instances of a kind. That is, one tells us something about the nature or essence of the kind ‘violets’, whilst the other tells us something about all the particulars of that kind, which might be something that is not of the essence of the kind. For example, ‘all swans are white’ might be true in that all particular swans might be white. ‘Swans are white’, in contrast, is a statement about the kind swan and is false as the kind swan is not characterised by the property of whiteness—it is not part of the essence of that kind.
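The contrast between the two accounts of laws, and between a law and its corresponding generalisation, can be set out schematically. The notation is an illustrative assumption rather than Lowe’s own: $N$ stands for the second-order necessitation relation, $\mathrm{Char}$ for characterisation, and $\mathrm{Inst}$ for instantiation.

```latex
\begin{align*}
\text{Necessitation account:}\quad
  & N(F, G) \;\Rightarrow\; \forall x\,(Fx \rightarrow Gx) \\
\text{Lowe's law (`Ks are F'):}\quad
  & \mathrm{Char}(F, K) \\
\text{Lowe's relational law (`Ks are R-related to Js'):}\quad
  & \mathrm{Char}(R, K, J) \\
\text{Generalisation (`All Ks are F'):}\quad
  & \forall x\,(\mathrm{Inst}(x, K) \rightarrow Fx) \\
\text{Law does not entail generalisation:}\quad
  & \mathrm{Char}(F, K) \;\not\Rightarrow\; \forall x\,(\mathrm{Inst}(x, K) \rightarrow Fx) \\
\text{Generalisation does not entail law:}\quad
  & \forall x\,(\mathrm{Inst}(x, K) \rightarrow Fx) \;\not\Rightarrow\; \mathrm{Char}(F, K)
\end{align*}
```

The fifth line reflects the point that laws admit of exceptions through interfering factors, and the sixth reflects the swan example, where every particular swan might be white without whiteness characterising the kind.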

Thus, under this account, we have no need to invoke some new relation (in the formal or the ontological sense) to explain laws of nature—all that is required is the already posited relation of characterisation, but on this occasion holding between substantial and non-substantial universals instead of particulars. No further second-order necessitation relation is required.

Lowe does not think that laws of nature are (always) necessary states of affairs. This is because ‘natural’ or ‘physical’ necessity (that which laws of nature are about) is a species of ‘relative’ necessity: ‘a matter of what is necessarily the case given that some contingent truth obtains’ (2006a: 132). Natural necessity is therefore not the same as genuine metaphysical necessity. As it is not the case, for Lowe, that all natural laws concerning a kind involve all and only those properties that belong to the essence of that kind, the laws of nature may not be necessary in the metaphysical sense. For example, Lowe denies that it is part of the essence of water that it dissolves salt, as he thinks it possible that water (the same substance) could exist in a possible world in which it does not dissolve salt. Instead,

[a]t most we can say that if there is a law, in a given possible world, that water dissolves common salt, then it follows of necessity in that world that any particular quantity of water has a tendency or disposition to dissolve any piece of common salt with which it may come into contact. (2006a: 132)

This, of course, leaves open the question of what the essence of water is; this is discussed in section 4. However, we can see that laws of nature about water are only physically necessary: it could have been that water was different, which rules out such laws being metaphysically necessary.

e. Kinds

Kinds, or substantial universals, are, for Lowe, abstract objects. This is because kinds satisfy two plausible ways in which an entity might be thought to be abstract.

First, abstract could be contrasted with concrete, where a concrete entity exists in space and time, whilst an abstract entity does not. We should not take this to mean that abstract entities and concrete entities have different types of existence. Rather, to be abstract or concrete is to have certain sorts of properties, or better to essentially have certain sorts of properties. Thus, we can hold that an entity is concrete if it essentially has spatiotemporal properties or relations, and an abstract entity does not essentially have any spatiotemporal properties or relations. A table is concrete as a table essentially possesses some spatiotemporal properties (a particular table must be somewhere and somewhen), whilst numbers are abstract as they do not essentially have spatiotemporal properties.

Second, an abstract entity is one that is logically incapable of existing independently. Here, we mean metaphysical independence rather than independence in thought. So an abstract entity is one that cannot exist independently of some further entity. For example, as we have seen, the particular shade of red of an apple might be thought to be incapable of existing without the further existence of some other properties of that apple, or the apple itself.

Thus, to give an example, the kind horse does not essentially possess any spatiotemporal properties even if particular horses do. As an immanent realist, Lowe does not think the kind is ‘wholly present’ where the instances are. The kind ‘horse’ also cannot exist independently of there being instances of that kind. This also is in line with the weak immanence thesis such that every existing universal is instantiated (2006: 99–100).

Thus, we can conclude that kinds must be objects, as to be an object is to have determinate identity conditions, where if x and y are objects, then there will be some ‘fact of the matter’ as to whether x is identical to y or not (1995: 511–513); and that they are abstract for the above two reasons. Note that this does not mean that Lowe denies that there might be particular objects that are abstract. If numbers should be thought of as objects, then they would appear to satisfy the conditions of being abstract particular objects. Again, Lowe’s conception of ontology is such that it need not take a firm position on this in order to delineate the categories and their formal characteristics. Instead, Lowe was keen to build a system first, and to later consider what entities, if any, fall under which categories.

f. Further Formal Ontological Relations

Despite the focus on characterisation and instantiation in the preceding discussion, they are not the only formal ontological relations that Lowe is committed to. The following will briefly summarise some other key formal ontological relations in Lowe’s system. Dependence will not be discussed directly, but this is not because Lowe had little to say about dependence. In fact, the opposite is true (see 1994, Tahko and Lowe 2015), though Lowe’s work on this is harder to summarise in an accessible way. Rather, as noted above, dependence is taken by Lowe to be a family of relations, founded upon other formal relations, including those mentioned above, and those discussed in this section.

i. Exemplification

The relation of exemplification holds, diagrammatically, diagonally between particular objects, and non-substantial universals. Therefore, Lowe holds that an object can exemplify an attribute in two ways: the object may instantiate a kind, which is characterised by the attribute; or the object may be characterised by a mode which instantiates that attribute. Exemplification is thus not fundamental, as it can be analysed as two distinct patterns of instantiation and characterisation relations.

Though not fundamental, exemplification is important in Lowe’s system, as is the distinction between the two ways in which an object might exemplify an attribute. This is because these two ways to exemplify an attribute express the distinction between dispositional and occurrent (or categorical) predication: the difference between saying ‘This stuff dissolves in water’ and ‘This stuff is dissolving in water’. For Lowe, both of those predications express the exemplification of the same attribute (non-substantial universal), but do so in distinct ways.

On Lowe’s view, then, it is not strictly correct to distinguish, as many do, between occurrent (or categorical) and dispositional properties. But this distinction does have an ontological ground; it is not merely a difference in language:

A sentence of the form ‘a is occurrently F’ means ‘a possesses a mode of Fness’, whereas a sentence of form ‘a is dispositionally F’ means ‘a instantiates a kind K which possesses Fness’. Thus, according to this view, properties (in the sense of universals) primarily characterize kinds and only derivatively or indirectly characterize individual substances or objects. (2006a: 125)

This is an ontological difference in how this indirect characterisation occurs, although not one where the difference lies in there being distinct types of properties (compare Heil 2010 and the view that properties are ‘powerful qualities’).

ii. Identity

Identity, for Lowe, is purely formal, rather than being a relational property. Lowe does not think that questions about self-identity are trivial, since they involve complex issues about identity conditions, nor that they are unimportant, since it is in virtue of self-identity that objects are countable and can constitute a plurality. Rather, identity (and self-identity) is a necessary condition upon the existence of objects. This makes identity too fundamental to be something in the world; it instead describes how items are in the world.

As detailed in section 4, understanding the identity conditions of an object is a crucial aspect to understanding the essence of the object, as these identity conditions are supplied by the kind that the object instantiates, which is itself part of the essence of that object—an object cannot become a different kind of object without the original object ceasing to be.

iii. Composition

The distinction between constitution and composition is important for many reasons, but in Lowe’s work it is perhaps best known as being at the centre of the claim that a statue and the lump of bronze that it is composed of are distinct. That is, Lowe defends the view that where there is a statue, there are two non-identical overlapping objects: the statue and the lump of bronze (see 2009a: chapter 6; 2006a: 49–51).

Composition is a many-one relation that holds between a (non-simple) whole and (some of) its proper parts. A bronze statue is composed of the bronze atoms that are the proper parts of the statue. The conditions under which an object can be composed are given by the kind (at the relative level of composition) that the object is an instance of. Thus, the composition conditions of a bronze statue are different from the composition conditions of the bronze atoms (which are composed of sub-atomic particles) that compose the statue. As an example, some bronze atoms compose a lump of bronze at a time t just in case (1) those bronze atoms are fused together over a period of time to which t belongs and (2) during that period there are no other bronze atoms with which any of them are fused. (2) ensures that the lump of bronze is ‘maximal’, meaning that during the period in which we are discussing the composition conditions of that lump of bronze, it cannot be fused to further bronze atoms; that is, the lump of bronze is not a proper part of some further larger lump of bronze (2006a: 50). (2) also ensures that there cannot be two spatially coinciding objects of the same kind, a possibility that Lowe rejects due to the problems that it gives rise to with respect to individuating those distinct objects (1998: 202; 2002a: 71).

From this Lowe concludes that a lump of bronze and the statue it composes have different composition conditions, as it is the case that the conditions on a statue must include its shape, whilst this is not the case for a lump of bronze. Furthermore, a statue can be composed of different lumps of bronze over its lifetime (1995b). This is not possible for a lump of bronze, for if a lump of bronze were to lose one of its bronze atoms then it would cease to be that (original) lump of bronze. In this way, composition conditions are closely related to persistence conditions.

iv. Constitution

Constitution, for Lowe, is not identity, but rather is ‘the closest way in which two entities can be related while still remaining numerically distinct’ (2006a: 51). Perhaps the closest Lowe comes to providing a precise definition of constitution, though one explicitly restricted to cases in which both x and y are composite objects, is that ‘x constitutes y at time t just in case x and y coincide spatially at t and every component part of x at t is also a component part of y at t, but not every component part of y at t is also a component part of x at t’ (2009a: 89). Constitution can be said to result in the view that there can be two distinct spatially coinciding objects.

An example is perhaps the best way to get at Lowe’s conception here. Take again our statue and lump of bronze. The statue has the shape and weight it has in virtue of the shape and weight of the lump of bronze. The phrase ‘in virtue of’ is for Lowe ‘typically apt’ when constitution holds. But although the lump of bronze constitutes the statue, they are distinct entities. We know they are distinct because they have distinct identity and existence conditions. This is not, though, discovered empirically. Rather, we know it because in grasping part of the essence of statues and lumps of bronze, we know that they have distinct essences and thus are distinct entities (2008b: 46).

That constitution is not identity is also important for Lowe’s solution to the problem of Tibbles, raised by Geach (1980) in his argument for relative identity. Without going into full detail of the case, we want to say that both ‘Tibbles is a cat’ and ‘The lump of feline tissue, c, is a cat’ are true. This raises a puzzle if we imagine some proper part cn of c that contains all of c except for one hair. If ‘c is a cat’ is true, then presumably so is ‘cn is a cat’. Extending this, we now seem to have to accept, as the full example goes, 1001 different cats all sitting on the mat.

Lowe’s distinction between constitution and identity allows for a solution to this. Lowe argues that we must recognise that the sortal terms ‘lump of feline tissue’ and ‘cat’ have different criteria of identity associated with them. The removal of one part of a lump renders the remaining lump a different lump—we cannot take a hair away from c without destroying c. However, the same is not the case with Tibbles, as Tibbles might, as in other extensions to the example, lose its tail, but this would not destroy Tibbles.

This difference in the criterion of identity of c and Tibbles indicates to us that there are two different senses of the predicate ‘-is a cat’. c is a cat only in the sense of c constituting a cat, whilst Tibbles is a cat in the sense of Tibbles being an instance of the sortal kind ‘cat’. ‘-is a cat’ is not ambiguous once we recognise the distinction between the ‘is’ of constitution and the ‘is’ of instantiation (2009a: chapter 6).

g. Persistence and Change

i. Endurantism vs Perdurantism

Lowe’s views on persistence, temporary intrinsics, and change are perhaps best shown in a dialogue that he had with David Lewis in Analysis in the late 1980s. In these papers, Lowe outlines his rejection of temporal parts, and of Lewis’ still standard distinction between endurantism and perdurantism. In Lewis’ terminology, ‘something perdures iff it persists by having different temporal parts, or stages, at different times, though no one part of it is wholly present at more than one time; whereas it endures iff it persists by being wholly present at more than one time’ (Lewis, 1986: 202).

Lowe rejects this way of framing the question, and thereby rejects both endurantism (so conceived) and perdurantism. On endurantism, Lowe argues, in parallel to the above arguments about universal properties, that ‘there is no useful notion of such a thing being “wholly present” at a time’ (1987a: 152). The issue is that ‘wholly present’ must be contrasted with the notion of ‘partially present’. However, if we were endurantists on Lewis’ conception, the idea of a ‘partially present’ object would simply make no sense.

Lowe rejects perdurantism as he rejects the existence of temporal parts for ordinary, concrete objects, noting that he finds the notion ‘scarcely intelligible’ (1987a: 152; he accepts as possible, though without committing himself to, the view that events and processes have temporal parts; see 1998: 99–100).

More substantively, Lowe thinks that the only way to get some grip on the notion of temporal parts is by analogy to spatial parts. Lowe thinks that concrete things can only have spatial parts if the things are extended in space. Following the analogy, a concrete thing can only have temporal parts if it is extended in time. However, this means that the debate is no longer about endurantism or perdurantism:

the perdurance versus endurance debate doesn’t really hinge upon issues in mereology (the study of part–whole relations) as such, but rather upon the question of whether anything…is extended in time, in anything like the way in which things are extended in space. But this is at bottom a question about the nature of time, rather than a question about the nature of things existing in it. The question is whether we can properly talk about time as being some sort of dimension of reality, relevantly akin to the three dimensions of space. (1998: 102)

Indeed, Lowe, in a set of later papers (though clearly echoing the above sentiments), ultimately argues that 3D (roughly endurantist) and 4D (roughly perdurantist) descriptions of the world ‘are equivalent in the sense of being intertranslatable without remainder, and [Lowe and McCall] take the position that there is no “fact of the matter” as to whether we live in a 3D or 4D world’ (2006c: 570; see also 2003b).

The main reason for this is that in the case of some particles that have no parts, and exist at only one time, we can describe those particles ‘indifferently’ as instantaneous 4D temporal parts, or as 3D objects that exist only at one time, with a one–one relationship between such descriptions (2006c). This claim should be considered alongside Lowe’s related but separate claim that perdurance and endurance accounts of persistence are in fact equally good at handling problems such as vagueness (see 2005b, a response to views expressed in Sider 2001 and Hawley 2001).

ii. Persistence and Intrinsic Change

What then is Lowe’s view about persistence over time? To get at this, we must distinguish between the metaphysical and the semantic problems, as each requires its own answer. Indeed, Lowe thinks that Lewis’ solution fails in part because it tries to provide both a semantic and a metaphysical solution at once (1988).

The semantic problem is that of specifying the logical form of sentences ascribing temporary intrinsics to objects. Lowe’s solution is ‘adverbialism’. On this solution, the correct form of such sentences is ‘a is-at-t F’; that is, it is the having of a property that is relativised to a time. Thus, an object is not simply characterised by a property, but instead the relation of characterisation (or whatever relation holds between an object and a property that it has) is relativised to time. The main reason for Lowe’s endorsement of this is that he thinks it is the ‘least revisionary’ of our common-sense talk of objects persisting through change when compared to solutions that relativise the property, such that it is in fact a relational property, and solutions that relativise the object itself, that is, posit temporal parts.

We can see the intuitive pull of this view when we (cautiously) consider the analogous case of spatial properties. To capture what we mean by the sentence ‘The Thames is broad in London’, it is best analysed as ‘The Thames is-in-London broad’. Again we can see that it is the ascription of a property that is, in this case, spatially relativised (1988: 73–75).

The metaphysical problem of intrinsic change is the problem of how there can be objects for which the semantic problem arises, that is, how there can be objects that can seemingly survive through change. To this problem, Lowe’s solution is that the identity over time of objects is founded in the preservation of certain relationships between that object’s constituent parts at any given time. Thus, a tree can survive a change in its properties because its ‘diachronic identity is consistent with a degree of replacement and/or rearrangement amongst its components, sufficient to allow for growth and maturation and so forth’ (1988: 76). This replacement or rearrangement explains how an object can change its shape and yet remain the same object: the change in shape can be explained as a change in the relations between the object’s constituent parts, and the shape of the object supervenes on these relations between its constituent parts.

There are two main consequences of this view (see Lewis 1988 for a discussion of both). First, Lowe has to deny that constitution is identity. However, we have already seen that Lowe accepts this claim independently. Second, Lowe is committed to there being fundamental particles that have their intrinsic properties unchangeably. This, again, is something Lowe is willing to accept, arguing that classical atoms and fundamental particles of modern physics are posited as having their properties unchangeably.

Note that the question of how much change an object can persist through—its persistence conditions—has not been addressed so far. Lowe’s proposed solution, as with other notions such as identity, composition, and existence conditions, is that an object inherits its persistence conditions from the kind that that object is an instance of, and in turn the kind has those specific conditions as it is part of the essence of that kind.

The topics of persistence and change are, of course, related to questions about the nature of time. For some of Lowe’s writings on time, see (1987b, 1992, 1998) where Lowe holds an adverbial view, or (2005b) where Lowe leans towards presentism.

4. Essence

As the last paragraph of the preceding section made clear, the notion of essence plays a significant role in Lowe’s metaphysics. This section outlines what Lowe means by essence and how we might come to know the essence of some entity, and it highlights some further crucial theoretical roles that essence plays for Lowe and his ‘serious essentialism’, the thesis that essences exist, but are not further entities (see 2013a: chapter 8).

a. What Are Essences?

Lowe claimed that the closest that we have to a definition of what an essence is comes from Locke: ‘the very being of any thing, whereby it is what it is’ (Locke, 1975: III, III, §15). Alternatively, we can approach the notion via the Aristotelean idea of a ‘real definition’, as opposed to a ‘verbal definition’: ‘A real definition of an entity, E, is to be understood as a proposition which tells us, in the most perspicuous fashion, what E is—or, more broadly, since we do not want to restrict ourselves solely to the essences of actually existing things, what E is or would be’ (2012a: 104–105). To ask what the essence of an entity is is to ask for its real definition: a definition of the thing itself.

Though heavily inspired by Locke, Lowe stresses that, contra Locke, essences are not further entities, since if all essences were entities, and all entities had essences, an infinite regress would arise. Further to this, essences are also not entities because essences are, in a sense, the identity of an entity. This is because to express the essence of an entity is to express its identity and existence conditions (from which other knowledge, such as the entity’s persistence conditions, can be derived). Expressing these identity and existence conditions is to express what that entity essentially depends upon, which ultimately is to express its essence.

This might seem strange given the above quote about real definitions being understood ‘as a proposition’. However, the real definition may be a proposition, but only as this proposition expresses the essence of the entity. The essence is not a further entity of any kind: not a set of identity and existence conditions, or a proposition. Therefore:

To know something’s essence is not to be acquainted with some further thing of a special kind, but simply to understand what exactly that thing is. This, indeed, is why knowledge of essence is possible, for it is a product simply of understanding, not of some mysterious kind of quasi-perceptual acquaintance with esoteric entities of any sort. And, on pain of incoherence, we cannot deny that we understand what at least some things are, and thereby know their essences. (2013a: 147)

This insistence that an essence is not a further entity is one reason that Lowe’s account can be distinguished from perhaps the best-known account of essence, especially as it derives from the work of Kripke and Putnam. Under that account, essences are discovered a posteriori, as the essence of an entity is what that entity consists of: the essence of water consists in its molecular make-up of H2O, or the essence of a living organism consists in its DNA. However, this makes the essence of an entity some further entity, opening the way for the possibility of an infinite regress once we ask what the essences of those further entities are.

Providing a clear illustrative example of an essence of an entity is difficult. Lowe thought that specifying or providing the real definition of an entity is incredibly hard, even though we can know aspects of the essence of entities. The one normally given, that Lowe borrows from Spinoza, is that of a circle:

Circle: A circle is the locus of a point moving continuously in a plane at a fixed distance from a given point. (2012a: 105; see Spinoza 1955)

This tells us what a circle is, and what Lowe termed its generating principle—what it takes for a circle to come into being. It is a necessary truth about circles because it is part of the essence of what it is to be a circle. However, importantly, not all necessary truths will be essential truths. This is because certain necessary truths, as mentioned above, are not metaphysically necessary, but only physically necessary.

That we can know something of the essence of non-existing things means that for Lowe essence precedes existence (2013a: 148). The reason that Lowe thinks this is tied to the metametaphysical claims discussed in section 2: to find out if some X exists, we must first know what X is. This is not to deny that, to understand the essence of something, we might first have to have discovered the existence of certain other kinds of things. For example, we knew what transuranic elements were before we discovered them, but only because we had already learned about the composition of other atomic nuclei and thus that what we were trying to find was elements with new combinations of protons and neutrons. Those transuranic elements could not even have been understood prior to the discovery of sub-atomic particles, but given that discovery we could come to know some part of the essence of some elements that we at that time had not empirically discovered.

Note that this also counts against the a posteriori nature of the Kripke–Putnam view. Given that essences are not further entities, they are not things out in the world to be discovered. As we have seen, this is not to deny that there is perhaps some empirical knowledge required prior to understanding the essences of some entities. It is only to say that some a priori grasping of an entity’s essence is required first.

b. Modality and Essence

One major theoretical role that essences play in Lowe’s ontology is that, contra Kripke–Putnam essentialism, and in line with other supporters of (broadly conceived) Aristotelean essentialism (see Fine 1994, 1995a, 1995b; Oderberg 2007, 2011; Koslicki 2012), essence is ontologically prior to modality. Essences should not be reduced to de re modal properties: ‘essences are the ground of all metaphysical necessity and possibility’ (2013a: 152; see also 2011b).

Much of the reason for this comes from Lowe’s arguments that other accounts, most prominently those built around the notion of ‘possible worlds’, are flawed. At heart, Lowe’s objection to such views is that they do not actually explain what they set out to explain, namely modal truths. This is because the very notion of a ‘possible world’ upon which such views must rely is itself highly obscure. Thus, in the end, such views have to resort to a form of modal primitivism whereby modal truths have to be taken to be brutely true or false. For this reason, Lowe argues that it is better to take essence to be the more fundamental notion, as essence can both be more readily independently grasped, and used to explain modality. (There is not space here for a full overview of Lowe’s criticisms of the various versions of alternative views; for more, see 2013a: chapter 8; 2008b; 2012c.)

c. Categoricalism

A further major element in Lowe’s account of essence is his distinction between general and individual essences:

If X is something of kind K, then, X’s general essence is what it is to be a K, while X’s individual essence is what it is to be the individual of kind K that X is, as opposed to any other individual of that kind. So suppose, for example, that X is a particular cat. Then X’s general essence is what it is to be a cat and X’s individual essence is what it is to be this particular cat, X. (2013a: 145)

The individual essence is required in addition to a general essence to ensure that being a particular entity is distinct from being just some entity of a particular kind. That is, as the general essence is shared by all entities that are Ks, the individual essence allows us to individuate between different Ks. Specifying the essence of an entity is to express that entity’s identity conditions, and identity conditions (or criterion of identity) are what allow us to individuate entities (see 1989; 2013a: chapter 5).

This distinction is closely tied to Lowe’s thesis of categoricalism: the view that one necessary condition on a thinker’s ability to pick out single objects in thought is the grasping of a categorial concept under which the object is conceived to fall (2013a: 21).

The question this is answering is how we can comprehendingly have singular thoughts about objects. Categoricalism is the answer. For Lowe, we cannot have singular thoughts about, to use Lowe’s example, a cat, Oscar, unless we have already grasped that Oscar falls under the categorial concept of ‘living being’, as this would appear to be the narrowest general concept that Oscar could fall under.

Of course, Oscar falls under other categories also, such as animal and cat, but these are subcategories of the more general category of living organism. This explains why we may have singular thoughts about Oscar even if we mistakenly believe that Oscar is a dog (because, say, we have misheard our neighbours and not actually seen Oscar ourselves). Categoricalism allows that we might mis-categorise objects, as it only requires a sufficient grasp of the essence of an entity; but it does rule out situations in which we thought that Oscar was actually a piece of furniture. In such cases, it seems correct to say that we have not actually grasped the essence of Oscar at all.

One immediate objection here might be that we could use notions such as ‘thing’ or ‘entity’, in which case we would always, trivially, be able to grasp part of the essence of an entity. However, as noted above, notions like ‘thing’ and ‘entity’ are transcategorial. This means that they cannot provide us with any essential knowledge about the entity in question. Transcategorial notions cannot allow for thinking about an object comprehendingly because the terms do not express categories, and therefore do not provide implicit or explicit knowledge of the relevant object’s criteria of identity.

We can see that categoricalism has a major consequence for Lowe: we cannot think comprehendingly about any entity without first grasping some aspect of its essence. The requirements for grasping a part of an essence are minimal, and Lowe is explicit that he thinks that even young children are capable of doing this. But, crucially, we do not require the ability to grasp the full essence of an entity to think comprehendingly about it, nor do we require empirical knowledge to grasp part of the essence of an object. As seen above, the statue and the lump of bronze are empirically identical, but we can distinguish them because we know what kinds of thing they are; that is, what their identity and existence conditions are, and therefore what they essentially depend upon.

5. Mind, Persons, and Agency

Alongside and intertwining with the complex ontological system described above, Lowe defended some less commonly held positions in the metaphysics of mind. This last section outlines some of the key aspects of these positions, though, owing to space limitations, it must again be taken as only a survey of his thinking on these matters, and, in particular, one that overlooks the significant negative arguments Lowe developed against the alternative positions.

These positions, especially his views on persons, are driven by what Lowe thinks about substances, properties, and other metaphysical and ontological topics already covered in this article. That is, there is a sense in which Lowe’s views about persons, agency, and the mind can be seen as an application to these debates of the metaphysical principles and views that Lowe defended. For example, throughout this section, the role of identity conditions is central. Similarly, in Lowe’s work on mental causation, universals play a significant role. This, of course, does not mean that we must accept Lowe’s positions in the philosophy of mind if we accept his broader ontological picture, or vice versa. Rather, this is only to highlight the intricate and systematic nature of Lowe’s philosophical views.

a. The Non-Identity of Mental and Physical States

The non-identity of mental and physical states for Lowe ultimately comes from his claim that the two have different identity conditions, and, as seen above when discussing identity more broadly, if two entities have distinct identity conditions, then they cannot be entities of the same kind.

It can, of course, subsequently be asked in what way are the identity conditions for mental and physical states different. One difference that Lowe appeals to is that a physical state is, by its essential nature, a thing whose possession makes a difference to at least part of the space that the thing that possesses it occupies. For example, the property of being sitting is physical, as in virtue of possessing that property a person fills space in a particular way. In contrast, Lowe holds that there are no such spatial connotations for mental states. As such, mental and physical states have distinct identity conditions and thus cannot be of the same kind (2008a: 22–23).

Lowe in fact wishes to go further, stating that physicalism simply cannot be true and is an unintelligible thesis. One reason for this claim is that he thinks that truths about identity cannot be exciting in the way that physicalism would require. This is because identity statements can only intelligibly hold between entities of the same kind. However, ‘exciting identifications—of physical objects with mathematical objects, or of mental states with physical states—all violate this principle by trying to identify items of quite different kinds’ (2008a: 23, chapter 5).

b. Non-Cartesian Substance Dualism (NCSD)

Lowe, as we have seen above, holds that a substance is an individual or a bearer of properties. In the case of mental properties (such as pain and desire), this bearer is the subject of experience, with human persons being a prime example of such subjects (though note that for Lowe non-human animals might also be considered subjects of experience, and as such his non-Cartesian substance dualism is not inherently restricted to humans only). As well as this subject of experience, there also exists a physical body, a substance that is the bearer of physical properties. Persons are to be identified with the subject of experience rather than the biological organism. Two distinct substances exist (the person and the body), and they are not identical with each other: ‘a human person is not identical with his or her “organized body” nor with any part of it’ (2008a: 95–96; see also 1996: chapter 2). Indeed, for Lowe, the non-identity of the self with its body or any part of it implies that the self is a simple, non-composite substance (2001).

Famously, Descartes’ dualism additionally held that a person cannot be identified with the person’s brain or body as the person can only be the bearer of mental properties, and not physical properties. Lowe is clear that his version of dualism is not committed to this additional claim. Instead, Lowe rejects the idea that persons can only have mental properties:

this sort of [non-Cartesian] substance dualist may maintain that I [a person] possess certain physical properties in virtue of possessing a body that possesses those properties: that, for instance, I have a certain shape and size for this reason, and that for this reason I have a certain velocity when my body moves. (2008a: 95)

This, though, is not to say that every physical property of the body is also possessed by the person, as otherwise the view would collapse into the view that the person is the body.

Thus there are two distinct substances, a subject of experience (a person) and the physical body that the person possesses, and, contra Descartes, the person can be the bearer of psychological and physical properties. This has the important consequence that, for Lowe, persons are not necessarily separable from their bodies, in the sense of being capable of disembodied existence. This is because Lowe thinks that it is part of the essence of what it is to be a human that we have bodies. If there were just a disembodied mind, then that would not be a human.

As discussed in more detail in section 5d, Lowe’s non-Cartesian substance dualism is a form of interactionist dualism—he is committed to the claim that at least some mental events cause changes in the physical world.

c. The Unity Argument for NCSD

Above we have largely just asserted in line with substance dualism that a person is not to be identified with their body. Lowe does provide arguments for this; here we focus on an argument that Lowe described as the strongest (2008a: chapter 5.2, see also 2010, 2014). The argument is as follows:

(1) I am the subject of all and only my own mental states.

(2) Neither my body as a whole nor any part of it could be the subject of all and only my own mental states.

Therefore,

(3) I am not identical with my body nor with any part of it.

Lowe takes (1) to be a self-evident truth (see 2006b for a defence of this against objections drawn from certain psychopathological conditions). So it is (2) that requires a defence.

The defence comes from the assertion that ‘no entity can qualify as the subject of certain mental states if those mental states could exist in the absence of that entity’ (2008a: 96). That is, mental states must have a subject, and it is not possible for the very same mental states to belong to a different subject than the one that they do in fact belong to.

However, the same cannot be said of the body. Whilst it may be true that if we were to lose some parts of our body then we might lose some mental states (we might lose certain sensations, though not always, as instances of ‘phantom pain’ show), we would still have in such cases many of the same mental states despite not having that bodily part. This means that many, if not all, of our mental states could exist even if our bodies, as a whole, did not exist. Our bodies might be different in terms of possessing different parts, but in those circumstances, we could still have the same mental states. This shows us, for Lowe, that the body as a whole cannot be the subject of all and only our specific mental states, and thus that we cannot be identified with our bodies.

If this line of reasoning is accepted, it is further apparent that the physicalist cannot respond by saying that it is the brain, and not the body, that is identical to the self, as the same argument can be run, except replacing ‘body as a whole’ with ‘brain as a whole’ with the same conclusion.

To be clear, this is not a claim that if the brain were destroyed then our mental states would continue to exist. Lowe’s account is not committed to the view that the mind could continue to exist without the brain. Rather, Lowe’s claim is that a person’s mental states do not depend on any particular part of the brain in the way that they do depend on the person continuing to exist—that there is no part of the brain which is such that were any part of it destroyed (say, one neuron destroyed), then all of the person’s mental states would cease to be. The same cannot be said for the person, especially in light of Lowe’s claim that persons are simple, non-composite objects.

The unity of mental states with the subject that those mental states belong to thus, for Lowe, shows that the body cannot be identified with the person as the subject of experience.

d. Mental Causation

Given the interactionist nature of Lowe’s dualism, two major issues arise: responding to arguments from the causal closure of the physical, and providing an understanding of the nature of mental causation.

A central part of Lowe’s case that mental causation is a real phenomenon, and not something that can be reduced to physical causation, is the recognition that mental causation, unlike physical causation, is intentional. Lowe argues that we need both sorts of causation in order to fully account for human behaviour. He writes:

Intentional causation is fact causation, while bodily causation is event causation. That is to say, a choice or decision to move one’s body in a certain way is causally responsible for the fact that a bodily movement of a certain kind occurs, whereas a neural event, or set of neural events, is causally responsible for a particular bodily movement, which is a particular event. The decision, unlike the neural event, doesn’t causally explain why that particular bodily movement occurs, not least because one cannot intend to bring about what one cannot voluntarily control—for, as I pointed out earlier, one cannot voluntarily control the precise bodily movement that occurs when one decides, say, to raise one’s arm. (2008a: 110)

The claim is this: we have voluntary control over certain actions as shown simply by our everyday experience of the world. A person cannot have voluntary control over the neural causes of a particular action, in part shown through the multiple realisability of neural causes. But, to understand why a particular event happened, it is not sufficient to know that an event of that kind occurred. For Lowe, only intentional causation can provide that kind of explanation, and, as we cannot intend to bring about what we cannot voluntarily control, it must be the case that there is a further, non-neural, mental cause of voluntary actions.

The mental decision, D, does not cause the particular bodily event, B, as it is the neural cause, N, that is causing the particular bodily movement. The occurrence of D is compatible with both B and B* occurring, as distinct particular bodily movements caused by N and N* respectively. However, D is required to fully explain why an event of the kind B occurred.

A significant upshot of this account of mental causation is that Lowe argues it means that we can avoid even the strongest form of arguments from the causal closure of the physical (see 2000c, 2003a, 2008a: chapters 2 and 3 for more extended discussions about the causal closure of the physical, including in-depth discussions of its various different forms). The form Lowe cites is as follows:

1) No chain of event-causation can lead backwards from a purely physical effect to antecedent causes some of which are non-physical in character.

2) Some purely physical effects have mental causes.

3) Any cause of a purely physical effect must belong to a chain of event-causation that leads backwards from that effect.

Therefore,

4) All of the mental causes of purely physical effects are themselves physical in character (from 2008a: 100–101, numbers changed from original).

In this form, Lowe rejects (3). It is not only event-causation that is involved in explaining the voluntary bodily movements of humans. We also require intentional-causation, but intentional-causation is fact-causation. Mental states are thus causally efficacious in determining what kind of event occurs, and, for Lowe, this is entirely compatible with the claim that some particular physical bodily movement, B, is caused by a particular neural event, N.

e. Agent Causation

In the above, we have made use of the notion of a voluntary action, without expanding upon it. In this last section, we outline in brief Lowe’s view of willing action, and agent causation.

Lowe is clear that he thinks that strictly speaking there is no such thing as event causation. Rather, there is only agent causation, that is, causation by agents acting in some manner. This means that whilst agent causation is in a sense primary, it is not the case that the agent just causes the event qua agent. Lowe thereby rejects classical agent causalism and libertarianism, as both distinguish between different types of causation, whereas Lowe does not. Instead, the agent causes an event by willing, or having a volition to perform, some action.

An agent, in the above statement, can include inanimate objects. However, when it comes to humans, and voluntary human behaviour, the agent causes some event by willing to cause an event of that type. The act of willing is an event, ‘but not merely an event: it is an action of A’s—indeed, it is a primitive action of A’s, because it is not further analysable in terms of more basic actions of A’s and the consequences of such actions’ (2008a: 7; chapter 9).

Such willings, or volitions to do such-and-such, are for Lowe the most basic kind of action that a free agent can perform, and they are completely uncaused and spontaneous. The idea of uncaused events is of course controversial; however, Lowe argues that it is no more mysterious than the spontaneous decay of a radioactive atom. As before, it is important to see that Lowe is keen here to stress that in order to explain human action, uncaused volitions are a required posit, but also that such volitions are entirely consistent with modern science.

The relationship between the agent and their volitions, though, is a non-causal relation. The relation is ‘internal’: ‘to speak of “performing a volition” amounts to speaking of doing a doing, which is similarly tautologous. This is why it is less misleading simply to say something like “A willed to ϕ”, rather than “A performed a volition to ϕ”’ (2008a: 8).

Lastly, these volitions or acts of the will are performed in light of reasons. Free agents, such as humans, have a special place in the causal world precisely because our agent causation occurs in light of such reasons and rational reflection. Thus, Lowe’s account does not posit some special restricted notion of agent causation. All causation is agent causation, with agents causing events by acting in certain ways. What is special about humans (and potentially other free agents) is that they possess a distinctively rational power of willing certain events to be caused in light of reasons.

This summary of Lowe’s conception of persons and personal agency is admittedly very brief. However, it should be enough to indicate that Lowe’s views are a complex response to the apparent evidence of free choice or action in the world, and one that aims to remain consistent with modern science. Lowe’s views are certainly distinctive, and run counter to much of the contemporary literature. For Lowe, the claims that some might find troublesome (non-Cartesian substance dualism, his conception of persons, and agent causation, amongst others) are warranted in virtue of the fact that Lowe thinks that the other views available and supported in the literature cannot adequately explain what they set out to explain. That is, Lowe’s views on this (and the above discussion of essence and ontology) need to be approached from the view that there is a certain range of phenomena to be explained, and Lowe thinks that additional posits are required to explain those phenomena.

6. Other Work

As stated, this summary of Lowe’s work has focused on issues in metaphysics, and related topics in philosophy of mind, logic and philosophy of language, focusing mainly on Lowe’s positive views on these topics. Lowe’s work has had numerous influences beyond the scope of this piece.

Some specific areas include his work on Locke (1986a, 1995a, 2005a, 2013b); the ontological argument (2007a, 2012b); truth and truth-making (2003c, 2007c, 2009c); reference (1993, 2012e, 2013a); vagueness (2005c, 2011c); intentionality (1978, 1980, 1982a, 1982b); predication (1986b, 2012d, 2013a); counterfactuals (1979, 1984, 1995c); and consciousness (1995d, 1996, 2006b). Lowe also wrote highly accessible general overviews of metaphysics (2002a) and philosophy of mind (2000a).

7. References and Further Reading

a. E. J. Lowe

  • 1978. ‘Neither intentional nor unintentional’, Analysis, 38: 117-18.
  • 1979. ‘Indicative and counterfactual conditionals’, Analysis, 39: 139-41.
  • 1980. ‘An analysis of intentionality’, Philosophical Quarterly, 30: 294-304.
  • 1982a. ‘Intentionality and intuition: a reply to Davies’, Analysis, 42: 85.
  • 1982b. ‘Intentionality: a reply to Stiffler’, Philosophical Quarterly, 32: 354-7.
  • 1984. ‘Wright versus Lewis on the transitivity of counterfactuals’, Analysis, 44: 180-3.
  • 1986a. ‘Necessity and the will in Locke’s theory of action’, History of Philosophy Quarterly, 3: 149-63.
  • 1986b. ‘Noonan on naming and predicating’, Analysis, 46: 159.
  • 1987a. ‘Lewis on perdurance versus endurance’, Analysis, 47: 152-4.
  • 1987b. ‘The indexical fallacy in McTaggart’s proof of the unreality of time’, Mind, 96: 62-70.
  • 1988. ‘The problems of intrinsic change: rejoinder to Lewis’, Analysis, 48: 72-7.
  • 1989. Kinds of Being: A Study of Individuation, Identity and the Logic of Sortal Terms, Oxford and New York: Basil Blackwell.
  • 1992. ‘McTaggart’s paradox revisited’, Mind, 101: 323-6.
  • 1993. ‘Self, reference and self-reference’, Philosophy, 68: 15-33.
  • 1994. ‘Ontological dependency’, Philosophical Papers, 23(1): 31-48.
  • 1995a. Locke on Human Understanding, London and New York: Routledge.
  • 1995b. ‘Coinciding objects: In defence of the “standard account”’, Analysis, 55: 171-8.
  • 1995c. ‘The truth about counterfactuals’, Philosophical Quarterly, 45: 41-59.
  • 1995d. ‘There are no easy problems of consciousness’, Journal of Consciousness Studies, 2: 266-71.
  • 1996. Subjects of Experience, Cambridge: Cambridge University Press.
  • 1998. The Possibility of Metaphysics: Substance, Identity and Time, Oxford: Oxford University Press.
  • 2000a. An Introduction to the Philosophy of Mind, Cambridge: Cambridge University Press.
  • 2000b. ‘Locke, Martin and substance’, Philosophical Quarterly, 50: 499-514.
  • 2000c. ‘Causal closure principles and emergentism’, Philosophy, 75: 571-85.
  • 2001. ‘Identity, composition and the self’, in Soul, Body and Survival, K. Corcoran (ed.), Ithaca: Cornell University Press, pp. 139-58.
  • 2002a. A Survey of Metaphysics, Oxford: Oxford University Press.
  • 2002b. ‘Kinds, essence and natural necessity’, in Individuals, Essence and Identity: Themes of Analytic Metaphysics, A. Bottani, M. Carrara, and P. Giaretta (eds.), Dordrecht: Kluwer, pp. 189-206.
  • 2003a. ‘Physical causal closure and the invisibility of mental causation’, in Physicalism and Mental Causation: The Metaphysics of Mind and Action, S. Walter, and H.-D. Heckmann (eds.), Exeter: Imprint Academic, pp. 137-54.
  • 2003b. ‘3D/4D equivalence, the twins paradox, and absolute time’, with Storrs McCall, Analysis, 63: 114-23.
  • 2003c. ‘Metaphysical realism and the unity of truth’, in Monism, A. Bachli, and K. Petrus (eds.), Frankfurt: Ontos Verlag, pp. 109-23.
  • 2005a. Locke, London and New York: Routledge.
  • 2005b. ‘Endurance versus perdurance and the nature of time’, Philosophical Writings, 10: 45-58.
  • 2005c. ‘Identity, vagueness and modality’, in Thought, Reference, and Experience: Themes from the Philosophy of Gareth Evans, J. L. Bermudez (ed.), Oxford: Oxford University Press, pp. 290-310.
  • 2006a. The Four-Category Ontology: A Metaphysical Foundation for Natural Science, Oxford: Oxford University Press.
  • 2006b. ‘Can the self disintegrate? Personal identity, psychopathology, and disunities of consciousness’, in Dementia: Mind, Meaning and the Person, J. Hughes, S. Louw, and S. Sabat (eds.), Oxford: Oxford University Press.
  • 2006c. ‘The 3D/4D controversy: a storm in a teacup’, with Storrs McCall, Nous, 40: 570-8.
  • 2007a. ‘The ontological argument’, in The Routledge Companion to Philosophy of Religion, C. Meister, and P. Copan (eds.), London and New York: Routledge, pp. 331-40.
  • 2007b. ‘La métaphysique comme science de l’essence’, in Métaphysique contemporaine: propriétés, mondes possibles, et personnes, E. Garcia, and F. Nef (eds.), Paris: J. Vrin, pp. 85-117. Translated as ‘Metaphysics as the science of essence’.
  • 2007c. ‘Truthmaking as essential dependence’, in Metaphysics and Truthmakers, J.-M. Monnoyer (ed.), Frankfurt: Ontos Verlag, pp. 237-59.
  • 2008a. Personal Agency: The Metaphysics of Mind and Action, Oxford: Oxford University Press.
  • 2008b. ‘Two notions of being: Entity and essence’, Royal Institute of Philosophy Supplement, 83 (62): 23-48.
  • 2008c. ‘Essentialism, metaphysical realism, and the errors of conceptualism’, Philosophia Scientiæ, 12 (1): 9-33.
  • 2009a. More Kinds of Being: A Further Study of Individuation, Identity, and the Logic of Sortal Terms, Malden, MA and Oxford: Wiley-Blackwell.
  • 2009b. Truth and Truth-Making, E. J. Lowe, and A. Rami (eds.), Stocksfield: Acumen.
  • 2009c. ‘An essentialist approach to truth-making’, in Truth and Truth-Making, E. J. Lowe, and A. Rami (eds.), Stocksfield: Acumen, pp. 201-16.
  • 2010. ‘Why my body is not me: the unity argument for emergentist self-body dualism’, in Emergence in Science and Philosophy, A. Corradini, and T. O’Connor (eds.), New York and London: Routledge.
  • 2011a. ‘The rationality of metaphysics’, in Stance and Rationality, O. Bueno, and D. P. Rowbottom (eds.), Special Issue of Synthese, 178: 99-109.
  • 2011b. ‘Locke on real essence and water as a natural kind: a qualified defence’, Aristotelian Society Supplementary Volume, 85: 1-19.
  • 2011c. ‘Vagueness and metaphysics’, in Vagueness: A Guide, G. Ronzitti (ed.), Dordrecht: Springer, pp. 19-53.
  • 2012a. ‘Essence and ontology’, in L. Novak, D. D. Novotny, P. Sousedik, and D. Svoboda (eds), Metaphysics: Aristotelian, Scholastic, Analytic, Frankfurt: Ontos Verlag, pp. 93-111.
  • 2012b. ‘A new modal version of the ontological argument’, in M. Szatkowski (ed.), Ontological Proofs Today, Frankfurt: Ontos Verlag, pp. 179-91.
  • 2012c. ‘What is the source of our knowledge of modal truths?’, Mind, 121: 919-50.
  • 2012d. ‘Categorial predication’, Ratio, 25: 369-86.
  • 2012e. ‘Individuation, reference, and sortal terms’, in Perception, Realism, and the Problem of Reference, A. Raftopoulos, and P. Machamer (eds.), Cambridge: Cambridge University Press, pp. 123-41.
  • 2013a. Forms of Thought: A Study in Philosophical Logic, Cambridge: Cambridge University Press.
  • 2013b. Locke’s Essay Concerning Human Understanding, London and New York: Routledge.
  • 2013c. ‘Neo-Aristotelian metaphysics: A brief exposition and defense’, in Aristotle on Method and Metaphysics, E. Feser (ed.), Palgrave Macmillan.
  • 2014. ‘Why my body is not me: The Unity Argument for Emergentist Self-Body Dualism’, in Contemporary Dualism: A Defense, A. Lavazza and H. Robinson (eds.), New York: Routledge.
  • 2015. ‘Ontological dependence’, with Tuomas Tahko, The Stanford Encyclopedia of Philosophy.

b. Other References

  • Armstrong, D. M., 1983. What is a law of nature?, Cambridge: Cambridge University Press.
  • Bradley, F. H., 1893. Appearance and Reality, Oxford: Clarendon Press.
  • Fine, K. 1994. ‘Essence and Modality’, Philosophical Perspectives, 8:1–16.
  • Fine, K. 1995a. ‘Senses of Essence’, in Modality, Morality and Belief. Essays in Honor of Ruth Barcan Marcus, Sinnott-Armstrong, W. (ed.), Cambridge: Cambridge University Press, pp. 53–73.
  • Fine, K. 1995b. ‘The Logic of Essence’, Journal of Philosophical Logic, 24:241–273.
  • Geach, P. T., 1980. Reference and Generality, Ithaca, NY: Cornell University Press.
  • Griffith, A. M. 2015. ‘Do Ontological Categories Exist?’, Metaphysica, 16 (1):25–35.
  • Hawley, K., 2001. How Things Persist, Oxford: Oxford University Press.
  • Heil, J. 2010. ‘Powerful Qualities’, in The Metaphysics of Powers: Their Grounding and Their Manifestations, A. Marmodoro (ed.), New York: Routledge.
  • Koslicki, K., 2012. ‘Essence, Necessity and Explanation’, in Contemporary Aristotelian Metaphysics, T. Tahko (ed.), Cambridge: Cambridge University Press.
  • Lewis, D. K., 1986. On the Plurality of Worlds. Oxford: Blackwell.
  • Lewis, D. K., 1988. ‘Rearrangement of Particles: Reply to Lowe’, Analysis, 48(2): 65–72.
  • Locke, J., 1975. An Essay Concerning Human Understanding, P. H. Nidditch (ed.), Oxford: Clarendon Press.
  • Miller, J. T. M., 2016. ‘The Non-existence of Ontological Categories: A defence of Lowe’, Metaphysica, 17(2): 163–176.
  • Moore, G. E., 1919. ‘External and Internal Relations’, Proceedings of the Aristotelian Society, 20: 40–62.
  • Morganti, M., and Tahko, T., 2017. ‘Moderately Naturalistic Metaphysics’, Synthese, 194(7): 2557–2580.
  • Mumford, S., and Tugby, M. (eds.), 2013. Metaphysics of Science, Oxford: Oxford University Press.
  • Novotný, D. D., and Novák, L. (eds.), 2014. Neo-Aristotelian Perspectives in Metaphysics, New York: Routledge.
  • Oderberg, D., 2007. Real Essentialism, London: Routledge.
  • Oderberg, D., 2011. ‘Essence and Properties’, Erkenntnis, 75: 85–111.
  • Schaffer, J., 2009. ‘On What Grounds What’, in Metametaphysics: New Essays on the Foundations of Ontology, D. Manley, D. J. Chalmers, and R. Wasserman (eds.), Oxford University Press.
  • Sider, T., 2001. Four-Dimensionalism: An Ontology of Persistence and Time, Oxford: Oxford University Press.
  • Smith, B., 1997. ‘Of Substances, Accidents and Universals: In Defence of a Constituent Ontology’, Philosophical Papers, 26:105–127.
  • Smith, B., 2005. ‘Against Fantology’, in Experience and Analysis, M. E. Reicher and J. C. Marek (eds.), Vienna: HPT and ÖBV.
  • Spinoza, B., 1955. On the Improvement of the Understanding, Ethics, Correspondence, trans. R. H. M. Elwes, New York: Dover.
  • Tahko, T. (ed.), 2012. Contemporary Aristotelian Metaphysics, Cambridge University Press.

 

Author Information

J. T. M. Miller
Email: jamiller@tcd.ie
Trinity College Dublin
Ireland

Explication

Explication is a method employed throughout philosophy and most sciences, as well as any cognitive endeavors which involve allocating concepts. It is also notably found in the sphere of law. Since explication is part and parcel of the traditionally philosophical subject of concept formation, philosophy is the main discipline to reflect on it extensively. Explication has sometimes been compared to a number of philosophical methods, such as logical (or conceptual) analysis, conceptual reduction, and conceptual engineering. Within a broad classification of kinds of analyses (Beaney 2014, sect. 1.1), explication is one kind of transformative analysis (as opposed to decompositional and regressive analysis; all three of them understood as analyses in a wide sense).

Historically, explication was most prominently described by Rudolf Carnap, according to whom “[t]he task of explication consists in transforming a given more or less inexact concept into an exact one […]. We call the given concept (or the term used for it) the explicandum, and the exact concept proposed to take the place of the first (or the term proposed for it) the explicatum.” (1950, 3) Carnap’s exposition remains the main reference point for scholars working on the topic of explication today. It is the most widely accepted general outline of the method of explication. As such, it allows for diverging interpretations in theoretical and procedural respects. This is demonstrated by the increase in research on explication around the turn of the 21st century.

A widely accepted instance of a simple and largely unproblematic explication is the 2006 definition of ‘planet’ by the International Astronomical Union (IAU). The discussion about that term was triggered by a number of discoveries of objects in orbit around the sun that are similar to the nine bodies that had until then been recognized as planets. Since there was no binding definition of ‘planet’ at that point, uncertainty arose about whether to call certain objects planets. The IAU member assembly established a definition according to which a planet is “a celestial body that (a) is in orbit around the Sun, (b) has sufficient mass for its self-gravity to overcome rigid body forces so that it assumes a hydrostatic equilibrium (nearly round) shape, and (c) has cleared the neighbourhood around its orbit” (IAU 2006). This disqualified Pluto as a planet, whereas the other eight planets kept their status. To a large degree, the new understanding of the term ‘planet’ incorporated key aspects of the earlier use patterns while being much clearer (cf. Murzi 2007).

This article provides a procedural account of explication outlining each step that is part of the overall explicative effort (2). It is prefaced by a summary of the historical development of the method (1). The latter part of the article includes a rough structural theory of explication (3) and a detailed presentation of an exemplary explication taken from the history of philosophy and the foundations of mathematics (4).

Table of Contents

  1. History of the Explicative Method
    1. Pre-Analytic Reflections on Explication
    2. The Analytic Classics
    3. Recent Developments
  2. Procedure
    1. Framework
    2. Preparation of an Explicative Introduction
    3. Introducing an Explicatum
    4. Assessing Adequacy
  3. Explication Theory
    1. Constituents of Explication
    2. The Analysis Paradox
  4. An Exemplary Explication
  5. References and Further Reading

1. History of the Explicative Method

Explication is often seen as a central method in philosophy, “an activity to which philosophers are given, and scientists also in their more philosophical moments” (Quine 1951, 25). As such, it has itself become a subject of philosophy. Philosophers are therefore in the business of both performing explications and reflecting on explications. Often, either activity gives rise to the other. Explication and similar methods have been continually employed throughout all of Western philosophy since Socrates/Plato, though under other names, such as ‘definition’, ‘concept-formation’, ‘characterization’, ‘description’, and many others. In the context of definition, the term ‘explication’ was already used by Locke (1997, 378, §III.IV.8). Recent debates are ultimately caused by the methodological character of the writings of logical empiricists. Since the late 20th century revival of metaphysics and the investigation into the history of analytic philosophy, metaphilosophical questions have (re)surfaced repeatedly. Ideas of authors like Rudolf Carnap, including his conception of explication, are being revisited. This renewed activity brings with it debates about new and old issues of explication. Further investigations into the precursors of explication are to be expected in the future.

Accordingly, at least three phases of engagement with explication can be distinguished in the history of philosophy: pre-analytic reflections (1.a), the analytic classics (1.b), and recent developments (1.c).

a. Pre-Analytic Reflections on Explication

The full procedure of explication comprises both analysis of given meanings and, based on that, the stipulation of new meaning. Investigations into acts of describing or prescribing meaning can therefore be conceived as contributions to the methodology of explication in a wide sense. In a narrow sense, only considerations that relate analysis of meaning to stipulation of meaning in a specific fashion qualify as relevant to explicative methodology. In the wide sense, all comments on analysis and stipulation of meaning (by, for example, Socrates/Plato, Aristotle, Thomas Aquinas, Locke, Leibniz, and Mill) deserve to be mentioned in order to fully represent the pre-analytic history of explication. The following summary is selective: the conception of Johann Heinrich Lambert, who contributed to the methodology of explication in the narrow sense (without using the term ‘explication’), will be described. Furthermore, Immanuel Kant’s theory will be recapitulated briefly. The assessment of some commentators, including the claim that Kant provided a relevant and even distinguished contribution, will be viewed critically.

Johann Heinrich Lambert was a mathematician, scientist, and philosopher highly regarded by Kant. Lambert contributed to a methodology of pre-explicative procedures in the preface to his book “Architectonic” (1771, VI-VII), where the procedures proposed are applied to metaphysical concepts. Lambert envisioned five pre-explicative measures: (i) Many expressions, especially philosophical ones, are highly ambiguous. This needs to be addressed in an analysis of ambiguity. (ii) Many, especially philosophical, expressions have multiple synonyms. They have to be named in an analysis of synonymy. (iii) The terroir analysis, as applied to predicates, identifies examples, counterexamples, sub- and superpredicates, neighboring predicates, etc. It is directed at clarifying the surrounding field of concepts (somewhat anticipating connective analysis; see Strawson 1992, ch. 2). (iv) The final analysis enquires into the purposes that are associated with the use of given expressions. (v) The diachronic analysis reviews the history of the use and characterization of explicanda. The order of these analyses and the interrelations between their results vary on a case-by-case basis (Siegwart 2007a, 109-112). Lambert merits special mention as his remarks predate what is usually considered the starting point of the systematic treatment of explication (Carnap 1950) by over 170 years. He assembles several of the main components of explication preparation (see sect. 2.b) in one paragraph with a clear perspective toward purpose-oriented definitions, although he does not discuss the stipulative step succeeding the preparation. To Lambert, the analysis of the preexisting concepts is never the goal; it is, however, a prerequisite for the explicative characterization.

Circumstances are different with Kant’s role in explication theory. In justifying his explication terminology, Carnap (1950, 3) superficially referred to Kant and Husserl (Boniolo 2003, 293-294). In recent years, it has been debated how accurate this ascription of parentage is. Some authors (Boniolo 2003, 294-295) even allege that Kant had a conception of explication superior to Carnap’s. In order to represent these issues faithfully, one has to pay attention to the essential synopsis regarding explanation of meaning and definition (in a wide sense) in the “Critique of Pure Reason” (Kant 1998, A 712-738/B 740-766, especially A 727-732/B 755-760). Kant started with two distinctions: (i) given and (ii) originally (to be) made concepts; (a) a priori and (b) a posteriori (or: empirical) concepts. According to Kant’s unusual terminology, explications are analyses of empirical concepts (i-b), exemplified by (the analysis of) the concept of gold. Expositions are analyses (in the sense of decompositions) or clarifications of a priori given concepts (i-a), such as the concept of substance. This is characteristic of philosophy, especially of metaphysics. Declarations create new empirical concepts (ii-b), like the concept of a chronometer. Definitions in a narrow sense create non-empirical, a priori concepts (ii-a), as in the concept of a circle. They are relevant for mathematics, but inappropriate for philosophy. The two latter kinds of concept formation are purely novative and therefore cannot be regarded as explications in Carnap’s sense. The two former kinds are analytic, according to Kant. In any case, they seem to be limited to detection and description of meanings. Unless the characterization of these activities additionally admits a finalizing prescriptive element or at least a method of concept analysis geared toward subsequent stipulative definition, Kant did not employ a concept of explication akin to the one treated below (sect. 2).

b. The Analytic Classics

Current conscious acts of explication, as well as reflections and disputes about explication, usually stand in a methodic tradition that was started by Rudolf Carnap. He viewed Menger’s explication of ‘dimension’ (Menger 1943) as a prototypical explication. Carnap’s seminal text on the topic (1950, 1-18, ch. I) is a methodological introduction to an investigation into probability. The exposition is self-contained and does not rely on his views on probability. An earlier explicit reference by Carnap to the method of explication can be found in (1947, 7-8, §2). In 1928, he already referenced the method of rational reconstruction, which is often likened to explication (2003, 158, §100 and v, preface to the second edition; see Beaney 2004, 125-128). Carnap named Husserl and Kant as sources of terminological inspiration, but Frege seems to have been a significant influence in this respect as well (Beaney 2004; Lavers 2013). Research on who influenced Carnap in what way is ongoing. At any rate, Frege did not propose any method of explication that, regarding conciseness, comes close to Carnap’s exposition. (But see Frege 1969, 224, 261-262. According to Blanchette 2012, 78: “Frege nowhere says what an adequate analysis is.”)

Carnap characterizes explication as “the transformation of an inexact, prescientific concept, the explicandum, into a new exact concept, the explicatum.” (1950, 3, §2) Though in a successful explication the explicatum is exact, Carnap does not fail to notice that the inexactness of the explicandum means that an explication cannot be said to be correct or incorrect in the sense that the explicatum exactly captures the explicandum. In order to work toward an exact explicatum, Carnap envisaged informal explicandum clarification as the first step in an explication. “[W]e must […] do all we can to make at least practically clear what is meant as the explicandum. […] Even though the terms in question are unsystematic, inexact terms, there are means for reaching a relatively good mutual understanding as to their intended meaning.” (1950, 4, §2) This involves distinguishing multiple meanings in the term associated with the explicandum (‘true’, not as used in carpentry, but as used in household language), giving examples and counterexamples (‘true’, not as in ‘a true democracy’), and naming synonyms (‘true’ in the sense of ‘accurate’). It should be noted that the clarification of the explicandum was very important to Carnap, but in (1950) a methodology of preparation of the explicative act is less explicit than in Lambert (1771). Carnap’s presentation of explication is primarily oriented toward that explicative act and its qualification with regard to certain desiderata.

Accordingly, in §3 (1950, 5-8) Carnap proceeds by naming four requirements an explicatum shall fulfill: (i) similarity to the explicandum, (ii) exactness, (iii) fruitfulness, and (iv) simplicity. All of the four requirements are partly, but not fully, explained by Carnap (see Boniolo 2003, 291-292, ch. II) and have therefore all spawned varying amounts of debate in the specialized literature. The similarity of the explicatum to the explicandum has been understood as either an overlap between the extensions of both concepts or an isomorphism between both extensions (cf. Brun 2017, sect. 4-5). Due to the different readings and the elusive nature of the explicandum, the similarity requirement is often seen as the most controversial one (Beaney 2004, 139). In addition, occasional talk of the explicatum replacing the explicandum exerts some pressure on the similarity requirement, since things that replace one another usually have to be similar in certain respects in order to be replaceable. With regard to Carnap’s exposition, the question remains in what respects an explicandum and an explicatum need to be similar for the latter to replace the former.

The other three requirements do not refer to the explicandum, but only to the explicatum. They are therefore not specific to a given explication. In an intuitive sense, exactness, fruitfulness, and simplicity are signs of quality for any introduction of a term. Carnap did not specify what he means by ‘exactness’, but he sympathized with the comparative concept of precision by Arne Naess. Naess said, roughly, the expression U is more precise than the expression T if and only if the set of admissible interpretations for U is a non-empty proper subset of the set of admissible interpretations for T. (Carnap 1950, 8, §3; Naess 1953, 60; on Naess’s notion of interpretation: 1953, 41-51) After Carnap, exactness of the explicatum is often viewed as either lack of vagueness or adherence to standards of formal concept formation. Fruitfulness of the explicatum is explained as its figuring in a high number of universal statements. Carnap distinguished between empirical laws for non-logical and theorems for logical explicata. (1950, 6-7, §3) He did not expand on the problem of individuating and counting said statements in the face of the trivial or minor modifications that can replicate any such statement infinitely many times. Simplicity is presented as being subordinate to the other requirements (1950, 7, §3). It seems to refer to a low degree of syntactical complexity of the explicative definition (or other means of explicatum introduction) and, possibly, to the acceptance and handiness of the concepts and terms that are employed to introduce the explicatum.
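The comparative notion of precision can be illustrated with a minimal sketch in Python. It assumes, purely for illustration, that the admissible interpretations of an expression can be listed as a finite set of labels; the function name and the sample interpretation sets are hypothetical and are not taken from Naess or Carnap. On this reading, the more precise expression is the one that admits fewer, but at least one, of the interpretations of the other.

```python
def more_precise(u_interpretations, t_interpretations):
    """Rough Naess-style comparison: U is more precise than T iff the
    admissible interpretations of U form a non-empty proper subset of
    the admissible interpretations of T."""
    u = frozenset(u_interpretations)
    t = frozenset(t_interpretations)
    # bool(u) rules out the empty set; u < t is the proper-subset test.
    return bool(u) and u < t


# Hypothetical interpretation sets, chosen only for illustration.
TRUE_IN_GENERAL = {"accurate", "genuine", "properly aligned"}
TRUE_AS_ACCURATE = {"accurate"}
```

With these sample sets, `more_precise(TRUE_AS_ACCURATE, TRUE_IN_GENERAL)` holds, while the converse comparison does not. An expression with no admissible interpretation at all is not counted as more precise than anything.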

In the two subsequent sections, Carnap gives a number of examples and distinguishes different kinds of concepts that are conceivable as explicata. These sections serve primarily to illustrate the procedure of explication. However, in the last section of his exposition (1950, 15-18, §6), Carnap adds another aspect to explication that was important to him, namely interpretation. In his view, explication is a form of formalization which is the first part of the axiomatic method to establish a formal scientific theory. The second part is the interpretation of the axioms or postulates, which determine the interpretation of the definitions as well. From the 1940s onward, this was a major programmatic issue for Carnap, and it was important to him independently of the method of explication. In the literature on explication, interpretation plays only a minor role.

Together with some of his other methodological publications, Carnap’s exposition influenced other scholars in one way or another. Most notable are Nelson Goodman, Carl Gustav Hempel, Willard Van Orman Quine, and Peter Strawson. Hempel included the method in his Fundamentals of Concept Formation (1952, pt. I) where he distinguishes different ways to endow meaning. Among other things, analysis of meaning (cf. 2.b below) is described in detail. Explications considered as real definitions (in a very wide sense) he characterizes in continuity with Carnap: “Explication is concerned with expressions whose meaning in conversational language or even in scientific discourse is more or less vague […] and aims at giving those expressions a new and precisely determined meaning, so as to render them more suitable for clear and rigorous discourse on the subject matter at hand” (1952, 11). Hempel points out that the interplay between analyzed meaning on the one hand and systematic discursive interests on the other calls for a “judicious synthesis” (ibid.) of both.

Quine most notably discussed explication with reference to two set-theoretic conceptions of the term ‘ordered pair’. (1960, 257-262, §53) He developed the method while discussing ontological issues related to that expression and then took the method to be representative of what happens in philosophical analysis and explication. In Quine’s own view (1960, 259, fn. 4), he followed Carnap, although he pragmatically converted the requirement of similarity: Inspired by the use of the explicandum, explicators associate the explicatum (Quine: “explicans”, see, for example, Reichenbach 1951, 49) with functional criteria of adequacy that are then supposed to be satisfied by it. In the case of the ordered pair there is just one such criterion: ∀xyzw [(x, y)=(z, w) → x=z ∧ y=w]. It is satisfied by any adequate definition of the ordered pair, for example by the one presented by Kuratowski: ∀xy [(x, y)={{x}, {x, y}}]. The choice of criteria is open to those who perform explications and is guided by systematic interests that are associated with the actual use of the explicandum. For instance, the specific criterion for the ordered pair proves important when defining ‘relation’ by means of ‘ordered pair’. “We have, to begin with, an expression or form of expression that is somehow troublesome. […] But also it serves certain purposes that are not to be abandoned. Then we find a way of accomplishing those same purposes through other channels, using other and less troublesome forms of expression.” (Quine 1960, 260, §53)
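How a functional criterion of adequacy constrains candidate definitions can be illustrated with a small Python sketch. It models Kuratowski’s definition with nested frozensets and checks the ordered-pair criterion by brute force over a small finite domain. This is only a finite spot check under illustrative assumptions, not a proof; the function names and the deliberately inadequate `unordered_pair` candidate are hypothetical and do not come from Quine.

```python
from itertools import product


def kuratowski_pair(x, y):
    """Kuratowski's definition: (x, y) = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def unordered_pair(x, y):
    """An inadequate candidate: the plain set {x, y} forgets the order."""
    return frozenset({x, y})


def satisfies_pair_criterion(pair, domain):
    """Check the criterion (x, y) = (z, w) -> x = z and y = w
    for all x, y, z, w drawn from a finite domain."""
    return all(
        x == z and y == w
        for x, y, z, w in product(domain, repeat=4)
        if pair(x, y) == pair(z, w)
    )


DOMAIN = [0, 1, 2]
```

On this domain, `satisfies_pair_criterion(kuratowski_pair, DOMAIN)` is true, whereas the unordered candidate fails because {0, 1} and {1, 0} coincide. Any other definition that passes the criterion would count as equally adequate in Quine’s sense, which is the point of stating the criterion functionally.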

Compared to Carnap (1950), these criteria of adequacy are new. Later, Carnap presupposed the Quinean understanding of adequacy in his reply to Strawson (Carnap 1963, 939). At any rate, the pragmatic outlook with its rather liberal selection of functional criteria of adequacy is more pronounced in Quine than in Carnap. (Within Quine’s own writing, the characterization of explication in (1960) is already a liberalization. According to (1951, 25, ch. II), explication relies to a higher degree on previous usage of the explicandum, which is meant to be “preserved” through the explication. This was part of Quine’s critique of synonymy relations, which according to him are presupposed by explicators, too.)

Strawson’s contribution to the method of explication (1963; see Pinder 2017b) is critical in nature. Starting with a rather strong reading of Carnap’s approach, in which the explicatum “replaces” the explicandum (Carnap 1950, 3), Strawson maintains that replacing a non-scientific concept with a scientific concept cannot be done without distorting the locutions that employ the original concept(s). “[T]ypical philosophical problems about the concepts used in non-scientific discourse cannot be solved by laying down rules of use of exact and fruitful concepts in science. To do this last is not to solve the typical philosophical problem, but to change the subject.” (1963, 506) In effect, explication prevents philosophers from dealing with their original problems that arise in a context of unexplicated concepts. The criticism that explication is “changing the subject” is still discussed among explication scholars (1.c). It is related to one version of the paradox of analysis (3.b). In a subsequent section of his paper, Strawson drops the strong reading of replacement, but maintains that in order to show the worth of an explication, one has to somehow relate the pre-explicative framework to the post-explicative one (1963, 510-514). This, to some extent, still requires an exact grasp of the pre-explicative conceptual situation. But if an exact grasp of the explicandum is possible, what does one need an explicatum (and an explication) for? (See Carnap’s reply 1963, 933-940.)

Goodman (see Cohnitz and Rossberg 2006, ch. 3) can be credited with two contributions: First, he elaborated on the similarity criterion of adequacy by proposing a criterion of “extensional isomorphism” in his 1951 Structure of Appearance (1966, 13-22, §I,3). This is a relation between, on the one hand, a set of explicanda and their semantical interrelations and, on the other hand, a set of explicata and their semantical interrelations. The criterion is different from both Carnap’s extensional overlap and Quine’s functional criteria of adequacy. The second contribution is derived from the first one, as the criterion of extensional isomorphism is usually directed at systems of expressions or concepts instead of individual ones (see Brun 2016, 1235-1236). Carnap recognized Goodman’s proposal of extensional isomorphism, but he deferred the choice of any similarity criterion to the specific explicative scenario at hand (1963, 945-946).

An outline of the development of explication in the analytic classics would be incomplete without a reference to the definition of truth by Alfred Tarski (1944; 1956; 1969; Hodges 2014). Although he does not employ the term ‘explication’, Tarski’s own depiction of his endeavors suggests classifying them as such. With regard to the explanation of meaning, for example, he distinguishes between (a) an account of the actual use of the term and (b) a normative suggestion that the term be used in some definite way. He attributes a mixed character to his own project: “What will be offered can be treated in principle as a suggestion for a definite way of using the term ‘true’, but the offering will be accompanied by the belief that it is in agreement with the prevailing use of this term in everyday language”. (Tarski 1969, 63) The so-called semantical explication of truth deserves the attention of explication theorists because it allows for all explication components to be identified with ease. Also, the core steps in the procedure are effortlessly traceable (Greimann 2007, 263). This especially applies to the criteria of explicative adequacy and the explicatum language. The criteria are condensed in Convention T (Tarski 1956, 187-188). The remarks on the explicatum language and on its expressive power are associated with the distinction between object and metalanguage and, as a result, with the prevention of the semantical antinomies.

c. Recent Developments

Within recent discussions on explication, the mainly systematic contributions have to be distinguished from the mainly historical ones. In addition, there are numerous systematic publications that include significant exegetical sections. However, many exegetical investigations into the writings of authors referenced in the preceding section serve the purpose of correctly ascribing positions and changes of mind to pioneers of analytic philosophy and of identifying historical influences between them (such as Beaney 2004; Boniolo 2003; Carus 2007; Creath 2012; Floyd 2012). This section concerns only those contributions that are predominantly systematic.

Publications that contribute to the method of explication in systematic respects exist in a continuum with the works of Carnap, Quine, Goodman, Strawson and others. This is not true of the historical research on explication, which is a rather new phenomenon. Beginning in the 1960s, several scholars developed, criticized, and defended explication, sometimes enriching the discussion considerably. However, except for the four philosophers mentioned above, few scholars who worked on explication before 2000 are referenced in discourse after 2000. The renewed interest in explication is related to, and partly caused by, the revival of metaphysics toward the end of the 20th century and the subsequent rise of metametaphysics. Because of this historical situation, most of the recent investigations into explication consider its main field of application to be metaphysics and ontology. However, this is too narrow a view as evidenced in the settings within which some of the classic authors raise the issue of explication: such as Carnap (1950; philosophy of science), Quine (1960; set theory). In addition, many textbook examples lie outside metaphysics (Tarski’s explication of truth; 1956) or outside philosophy (IAU’s planet concept). Applications of explication in practical philosophy or in the special sciences are underrepresented within the research on explication (but see Hahn 2013, 34-53).

An example of the continuity between the pioneering efforts in explication methodology and later research is Hanna (1968). Hanna presumably was the first to develop an explicit procedural account of explication, which consists of five steps. (An incomplete procedural account is given by Naess (1953, 82-84).) Tillman (1965; 1967) transformed Strawson’s well known critique (see 1.b) into the method of “linguistic portrayal”, which can be seen as one step in a method of explication (2.b). Martin (1973) was the first to dedicate an entire paper to the explication of whole theories (systems of concepts, for example), as opposed to single concepts or expressions.

These examples can be seen as prototypes of some of roughly five very common kinds of systematic investigations into explication. (i) A number of contributions attempt, like Hanna, to establish a procedure of explication (Brun 2016; Greimann 2007; Siegwart 1997a) or to provide a formal theory of explication (Cordes 2017). It has to be noted that there is no canonical form of explication and that the various approaches are not directly compatible. A single procedure of explication that is both widely accepted and more detailed than the Carnapian exposition (1950) has yet to be devised.

(ii) Other scholars have focused, like Tillman, on single steps within a procedure of explication. Either widely accepted steps are spelled out, or new steps are proposed. Some mixed forms occur as well, like the explicandum clarification procedure by Shepherd and Justus (2015; Justus 2012; Pinder 2017a; Schupbach 2017), who in that context introduce experimental philosophy to explication. (For further examples see sect. 2 below.)

(iii) Conversely, others left the confines of isolated explications in favor of theorizing about the interrelations between multiple explications (see 3.a). This line of research on what might be called superexplicative structures continues Martin’s efforts. Brun (2017) refers to Martin’s emphasis on whole theories while trying to unify Carnap’s explication and Goodman’s reflective equilibrium. Brun’s final procedure can be seen as consisting of, among other components, multiple explications. Meanwhile, Siegwart considers chains of explications and types of disputes arising from rivalling explications (1997b, 263-265).

(iv) Another group of articles critically discusses various aspects of explication or the overall method without intending to enhance it. Strawson’s contributions were some of the first of this kind, and the various takes on the paradox of analysis and the changing-the-subject objection are often a centerpiece of both critical and sympathetic discussions. More recently, Maher’s express defense of explication (2007) has been widely noted. Reck (2012) exemplifies the kind of articles that are rather critical of explication. Both authors, like many others, base their assessments directly and only on Carnap’s conception of explication.

(v) Finally, there are those contributions that either relate explication to other methods or distinguish kinds of explication that had not been differentiated in the analytic classics. Radnitzky (1989) wants to improve on Carnap’s notion of explication by transferring it from logical empiricism to the framework of Popper’s evolutionary methodology: Successful explications are exclusively considered to be a byproduct of processes of theory enhancement or theory replacement. Equally critical of Carnap, but inspired by Gaston Bachelard, Ibarra and Mormann (1992) regard the explication of a scientific concept (e.g. number or line) as its generalization in an explicative theory (number theory or geometry, respectively). Carus (2012) is concerned with two kinds of explication: local and global. Haslanger (2012, 376) distinguishes conceptual, descriptive, and ameliorative analysis, occasionally associating the latter with Carnap-Quinean explication (2012, 367). While all three kinds of analysis may, depending on their execution, have considerable overlap with explication, ameliorative analysis notably pushes the boundaries of explication because the “target concept” is picked for social or political reasons rather than, as was presupposed by the pioneers of explication, purely theoretical reasons.

The five categories are neither exhaustive nor mutually exclusive. For example, Brun (2017) was filed under (iii) and Floyd (2012) was seen as predominantly exegetical, but both could also reasonably be read as examples of category (v). A general theme throughout most recent systematic contributions to explication is that of the constantly evolving nature of languages and concepts (see Wilson 2006). Several scholars feel this nature needs to be considered more in explication—a method that is often seen as dealing with rather rigid notions of language, expressions, and concepts.

2. Procedure

This section provides a procedural account of explication. The first sub-section briefly situates explication within the field of methods of introduction (2.a). Then, three steps are distinguished within explication (Siegwart 1997a, 29): preparation (2.b), the act of explicative introduction (2.c), and postprocessing (2.d).

The act of explicative introduction is not the explication; it is only part of it. Thus, it is highly elliptical to talk about, say, an isolated definition as an explication. Preparation and postprocessing are integral parts as well. In preparation, the need for an explication is assessed, the explicandum is identified and situated within a context, and an explicatum within another context is chosen. Also, criteria of adequacy are established. Postprocessing primarily involves an assessment of adequacy.

Note that ‘explicandum’ and ‘explicatum’ are here taken to refer to expressions in order to avoid conflicting views on concepts. This is in accordance with Carnap, who admitted both concepts and terms as explicatum/explicandum (see the quote at the beginning of this article). Any theory and procedure of term explication can be seen as a theory and procedure of concept explication once a suitable theory of concepts has been applied.

a. Framework

It is an everyday phenomenon in natural and artificial languages that we hear, read, or utter words and phrases without knowing their full meaning. We are unsure about their correct use or realize that we have used them in a way we would now consider to be questionable or even incorrect. In these situations, the meaning or use of the words may have to be fixed or entirely new words have to be fitted with novel patterns of use. Setting the meaning or use of a word in one of these ways is henceforth called introducing it. Introducing a word in this sense is to be distinguished from the didactic activity of teaching its correct use to someone. An introduction should entail the commitment and subsequent adherence to the meaning or use that was set. Thus, introduction is always stipulative (see Maher 2007, 336). What is usually called ‘reportive definition’ is not an introduction in this sense as no act that sets the meaning of an expression is being performed. Rather, the meaning that has been associated with the expression in question is being reported or its use is being described. The umbrella term ‘meaning clarification’ can be taken to refer to introduction and meaning description alike.

There are various methods and forms of introduction. Forms of introduction are, for example, definitions, axioms (or meaning postulates), and various types of metalinguistic rules. Acts of introduction are thus performed by, for example, setting definitions, setting axioms, or establishing metalinguistic rules that regulate the use of an expression. Each of these forms of introduction is compatible with explication as all of them are a means of providing an explicatum with a meaning or of determining how the explicatum is to be used correctly. In the literature on explication, the most frequent form of introduction, and sometimes the only one considered, are definitions.

When introducing an expression or a concept, there are many factors that may or may not be taken into account. Depending on what is to be taken into account, one follows a certain method of introduction. For example, previous usage of certain expressions may or may not be relevant to the act of introduction. More specifically, it is possible that there are expressions in use whose application is problematic (with respect to certain aims). If under such circumstances the introduction of expressions whose meaning or use is intended to more or less mirror the former expressions is attempted, then the method of introduction is called explicative or an explication. All other methods of introduction will be called novative. Novative introductions can be found, for example, in contexts of discovery. When, for example, a new plant species is discovered, it may receive any species predicate without the prior use of that expression being substantially relevant. For more on this topic, see Siegwart (1997a, 18-29, §§ 3-10).

b. Preparation of an Explicative Introduction

The preparation of an explicative introduction is intended to provide everything that is needed in order to perform introductions that can be evaluated according to standards of explication afterwards. Some preparatory steps are inevitable in order to yield an explication; at least the explicandum, the explicandum language, the explicatum, the explicatum language, and the criteria of explicative adequacy need to be identified in some way. Together with the explicative introduction (2.c), they constitute an explication in the sense that if one is missing, it can hardly be said that an explication has taken place. Therefore, they are constituents of an explication (3.a). But note that not all constituents need to be of a formal or explicit nature (this would be rather unusual at least for the side of the explicandum), and their identification may allow for some ambiguity. Thus, the minimal procedure of explication, which consists of the allocation of the five constituents (this sect.) and the introduction of the explicatum (the sixth constituent; 2.c), does not put any constraints on how to perform an explication. It simply acknowledges what is needed in order to be able to speak of an explication at all (see Cordes 2017). More constructively: If, after some interpretive effort, one finds six items in a text that can be understood, respectively, as explicandum, explicandum language, explicatum, explicatum language, criteria of explicative adequacy, and explicative introduction, then, and only then, it seems reasonable to speak of an explication.
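The minimal procedure can be pictured as a simple record with six slots. The following Python sketch is only an illustration of that bookkeeping: the class name, field names, and the sample entries for the ordered pair are hypothetical labels chosen here, and nothing in the sketch is meant to formalize the constituents themselves, which may well remain informal.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class ExplicationRecord:
    """Hypothetical record of the six constituents of an explication."""
    explicandum: Optional[str] = None
    explicandum_language: Optional[str] = None
    explicatum: Optional[str] = None
    explicatum_language: Optional[str] = None
    adequacy_criteria: Tuple[str, ...] = ()
    explicative_introduction: Optional[str] = None

    def missing_constituents(self):
        """Names of constituents not yet identified (None or empty)."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_explication(self):
        """Minimal condition: all six constituents have been identified."""
        return not self.missing_constituents()


# Illustrative filling based on the ordered-pair example from section 1.b;
# the descriptions of the two languages are assumptions made for this sketch.
ORDERED_PAIR = ExplicationRecord(
    explicandum="ordered pair",
    explicandum_language="informal mathematical usage",
    explicatum="(x, y)",
    explicatum_language="set theory",
    adequacy_criteria=("(x, y) = (z, w) -> x = z and y = w",),
    explicative_introduction="(x, y) = {{x}, {x, y}}",
)
```

A record with only an explicatum and a definition, for instance, would report the remaining four constituents as missing and would therefore not count as an explication in the minimal sense described above.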

Other steps are not obligatory but help to give an explication that establishes an explicatum fit to the explicator’s locutionary purposes: posing an explicative question, assessing the need for an explication, treating ambiguity in the explicandum, considering synonyms of the explicandum, reviewing empirical research on the use of the explicandum, and reviewing the history of explication. In this sub-section, these steps are dealt with in a suitable order.

(i) Posing of an explicative question (optional): Quite frequently, explicative endeavors are prompted by a “what-is-x” question. What are qualia? What is a norm? What is truth? (see Carnap 1950, 4, §2; Audi 2015, 209) Questions of this kind help to invoke the field in which the explication takes place. By posing this question, the field is only roughly contoured so that indeterminacy still remains. When posing an explicative question, one should emphasize that it demands a characterization; examples or non-characterizing general statements (such as natural laws) are not the required answers. This emphasis can be achieved by using semantic vocabulary within the explicative question: What is the meaning of ‘beauty’? How is the term ‘rational’ to be used and understood?

(ii) Assessing the need for an explication (optional): Questions demanding characterizations may be satisfied with reportive answers. In that case, an explication may not be in order. If a questioner demands an explication, this demand may or may not be justified. Whether the need for an explication is real can be assessed by pointing to problems with the prior usage of the explicandum. Carnap names inexactness of the explicandum as the sole reason for performing an explication (1950, 4). The concept of (in)exactness in explication is frequently discussed (Reck 2012, 99-101) and seems to be inexact itself (in some intuitive sense). Hahn (2013, 36-42) investigates three specific reasons for performing an explication: ambiguity, vagueness, and semantic gaps. According to Hahn, each of these semantic defects may be innocuous (such as the ambiguity of ‘bank’) or risky (such as the vagueness of ‘medically necessary treatment’). However, a given semantic defect may be acknowledged but at the same time be desirable for non-cognitive communicative reasons. Constructive ambiguity in diplomacy is a case in point (see Pehar 2001). An expression in a context may have other defects that constitute causes for explication, such as emotive connotation, or it may have multiple defects. Naming them helps the explicative effort in establishing criteria of adequacy. Alternatively, there may be no inherent defect within prior usage, although the explicator envisions some specific discursive purpose which calls for adjustments in the use of certain expressions. Generally, any perceived relevant defect in an explicandum can be understood as a mismatch between its existing use patterns and the purposes to which one wants to use it.

(iii) Naming the explicandum (necessary): The proposal of an explication needs some expression that is supposed to be the explicandum. Posing an explicative question already involves using or mentioning that explicandum. But with further deliberation on the subject, one may want to choose a different, but related, expression as the explicandum. For example, starting with the question as to what beauty is, an explicator may determine ‘is beautiful’ as the explicandum. Although naming an explicandum is obligatory within an explication, it is possible to revise this step later should a different expression turn out to be more suitable. In addition, there is the possibility of naming several expressions which are simultaneously explicated in one explicative endeavor. This may lead to one unifying explicatum or to several explicata, each of which is associated with one or more explicanda. Naming the explicandum is not the same as individuating the explicandum concept. If a pre-theoretic concept is considered to be the object of an explication, there are widely noted problems in individuating it, since its inexactness is presupposed by the usual understanding of ‘explication’ (Carnap 1950, 4, §2). The next four steps represent a limited remedy for that problem.

(iv) Naming the explicandum language (necessary): Since the explicandum is just an expression, it is not sufficient to serve as the basis of an explication which is supposed to build on prior usage. Thus, an explicandum language which includes the use patterns of the explicandum has to be identified. Associating some kind of use patterns with a language does not mean that this language has to be formal or even that the use patterns are known to the explicator. In fact, in this pre-explicative state, languages are often hard to individuate. For this reason, it suffices to identify a language by referring to the relevant context of use, such as a seminal text that employs the explicandum and is supposedly written in the explicandum language. Four examples of explicandum languages are contemporary philosopher’s English, the language of applied ethics, the language of the Vienna Circle debate about protocol statements, and the scientific jargon of some discipline, like physics. The most pertinent measure of what constitutes an acceptable identification of an explicandum language is the degree to which it encompasses those prior employments of the explicandum which the explicators deem to be the relevant reference points for their explicative project. For that purpose, it might be advisable to specify an explicandum theory if the distinction between theory and language is available. To give an example of this and the preceding step in one sentence: “Here I will explicate ‘velocity’ as it is used in Newtonian mechanics for engineering science.” (Carnap 1950, 3-5.)

(v) Treating ambiguity in the explicandum (optional): Risky ambiguity constitutes a reason for explication (see step (ii)). Different meanings or uses of expressions within the same context should be distinguished in an informal but systematic fashion (Siegwart 1997a, 31). This helps to clarify which of the meanings is the one that predominantly serves as the model for the later introduction of the explicatum. In some cases, multiple and systematically related but distinct meanings of the explicandum are supposed to serve as the model. In that case, it makes sense to develop an explication plan which involves multiple explicata that will be introduced in a systematic order. When, for example, explicating ‘supererogatory action’, the explicator will recognize a distinction between a type and a token understanding of actions in general. This may suggest that two explicata (‘supererogatory action type’ and ‘supererogatory action token’) and two acts of explicative introduction are in order, and that one explicatum could be defined by means of the other. A classic Aristotelian example of disambiguation concerns ‘healthy’ in four interrelated senses: The term may be applied to things preserving health, things causing health, signs of health, and things capable of health (Aristotle 1984, 1003a-b).

(vi) Considering synonyms of the explicandum (optional): In this step, synonyms of the explicandum are listed (Siegwart 1997a, 31). Since this is still a pre-explicative step, synonymy is taken in an intuitive sense. The list serves two purposes: (I) Synonyms may help to individuate the meaning that serves as the reference point for the explicative introduction. Thus, when explicating ‘argument’ in communication studies, one may list ‘dispute’ and ‘reason’ as two (partial) synonyms of ‘argument’ and then specify that this explication is not about ‘argument’ in the sense of ‘reason’. (II) The other purpose is to broaden the scope of explications. Some synonyms may be suitable as additional explicanda. This can lead to a refinement of the explication plan (see step (v)). Or, again, multiple explicanda can be merged into one explicatum; this then constitutes convergent explications (as in 3.a).

(vii) Reviewing empirical research on the use of the explicandum (optional): Shepherd and Justus (2015, sect. 3) argue that experimental philosophy can help with the preparation of an explication, especially with the individuation of the explicandum concept. According to them, surveys about the intuitive use of language help to uncover (I) vagueness, (II) ambiguity, (III) bias influencing intuition, (IV) non-biasing influences on intuition, and (V) central features of an explicandum concept. Data on these issues is relevant to what appears here as steps (ii) and (v). In addition, the experimental identification of central features of concepts can directly inform either the criteria of explicative adequacy or even the explicative introduction (Pinder 2017a, sect. 3). But explicators are not forced to follow the survey results in this respect. (Shepherd and Justus do not claim that either.) For example, when explicating conditional expressions of natural language, experimental philosophers may find that test subjects do not generally accept modus tollendo tollens. Explicators could still decide to put that inference pattern into the criteria of explicative adequacy in order to carve out a use of conditionals that does admit of modus tollendo tollens. Thus, experimental philosophy should primarily be seen as a heuristic in explication, an extension of the classical conceptual analysis method of contemplating and trying out acceptable and unacceptable uses of the explicandum. Similar to how explicators review results from experimental philosophy, they may also want to consider studies in corpus linguistics, but scholars of explication have not yet investigated this field.

(viii) Reviewing the explicative history of the explicandum (optional): Like the preceding steps, reviewing already existing explications of the same explicandum may serve heuristic purposes by contributing to the content of the current explication, as with regard to establishing criteria of adequacy (step (xi)). In addition, this step allows situating one’s own enterprise within the debate (Siegwart 1997a, 33-34). If there are no prior explications of the explicandum, it may help explicators to be aware of the potential paradigm function that their own explication may or may not serve (ius primae explicationis). On the other hand, if there are prior explications and relevant debates, explicators can point out similarities and dissimilarities and—if so disposed—may add critical remarks, thus partaking in explicative debates. Even without critical remarks, explicators should explain why they do not accede to existing explications (see 3.a).

(ix) Naming the explicatum language (necessary): Explicatum languages should be chosen carefully. Potential users of explicata and addressees of explications should be taken into account. It may help to itemize languages in a systematic fashion. If, for example, the explicatum language comprises reference to parts and wholes, one should consider both mereological and set theoretic languages. At any rate, the explicatum language should be one that is accessible, that is, one that is either intuitively useable in a correct fashion or one associated with available introductory literature. If necessary, explicators themselves have to give an introduction to the languages or even construct them ab initio. That does not imply that explicatum languages have to be formal, although this might be the case quite frequently. An explicatum language can also be “a more exact part of” the explicandum language (Carnap 1963, 935). If the explicatum language is informal, the individuation strategies are similar to those for explicandum languages (see step (iv)). Note that if the form of introduction later employed needs a background theory, then such a theory should be named in step (ix) as an explicatum theory within the explicatum language.

(x) Naming the explicatum (necessary): This step is as intricate as the preceding one. First, explicators must realize that there are several syntactical categories from which to choose in most explicatum languages. Thus, when explicating ‘beauty’, one may decide to choose a nominal expression or a predicative one as the explicatum. Both options entail several subordinate options. If the explicator decides for a predicate as the explicatum, the question of arity arises, which is connected to the question of what kind of operanda are acceptable for each place of the predicate. This is true of both formal and informal languages. Second, certain options may be excluded for various reasons. Semantical intuitions or plans about the post-explicative semantical relations between some expressions are relevant to determine which expressions are potential explicata, but since this itemization is only about expressions, none of the semantic relations is binding. It should be taken into account that prospective users of the explicata may desire explicata that can precisely describe complex relations and explicata that are simple and easy to use. Third, the explicatum or explicata must be explicitly named. The syntactical category and, if applicable, arity for each explicatum must be clear.

(xi) Naming the criteria of adequacy (necessary): The criteria of adequacy constitute the explications’ measure of success. Usually, it is a number of propositions involving the explicata that are supposed to be proven true after the explicative introduction has been performed (see Quine 1960, 257-266, §§53-54). Explicators may arrive at these propositions with the help of steps (ii) and (v) to (viii), in which various senses have been distinguished and conceptual interrelations have been scrutinized. It is advisable to assemble preliminary criteria of adequacy as a kind of wish list that is still formulated in the explicandum language. Eventually, however, conflicting criteria have to be eliminated and all criteria must be formulated in the chosen explicatum language so as to yield a demonstrably successful explication (see 2.d). Because the criteria of adequacy will often derive from pre-explicative intuitions that are associated with the explicandum, they can be seen as codifying Carnap’s requirement of similarity between explicandum concept and explicatum concept (Quine 1960, §53). Traditionally, the criteria of adequacy are thought of as explicatum language propositions that are true in the given explicatum language. However, sometimes it is important to explicators that certain propositions do not follow from the explicative introduction and that their negations do not follow either. For example, agnostic explicators of the concept of god may want to avoid a scenario which allows them to derive god’s omnipotence, or lack thereof, from the introduction of the explicatum. Formulating this kind of undecidability as a criterion of adequacy turns these criteria into partially metalinguistic propositions. Explication should allow for this kind of metalinguistic criteria because they allow us to formulate the other Carnapian requirements as criteria of adequacy. Thus, from an explication of ‘planet’ one may require as per two criteria of adequacy that (I) planets are celestial bodies and that (II) planethood implies that most planetary laws discovered before 1990 still hold. The second criterion is of a metalinguistic nature and codifies Carnap’s fruitfulness requirement (1.b).

c. Introducing an Explicatum

While preparing an explication, its explicandum, explicandum language, explicatum, explicatum language, and its criteria of explicative adequacy have to be named (2.b). The first two constituents represent the connection to a preceding language practice and affect the specification of the latter three constituents. However, only the latter three are relevant to the introduction of the explicatum. The explicatum is introduced in the explicatum language in a way that supports the criteria of adequacy. The result of the act of introduction, such as a definitional proposition, is the sixth constituent of an explication (see 3.a).

In order to introduce explicata in accordance with a certain form of introduction (definitions, meaning postulates/axioms, metalinguistic rules), that form of introduction needs to be available for the explicatum language. Depending on the form of introduction, further requirements must be met—for example, some definition rules are formulated with reference to theories which would have to be provided before performing the act of explicative introduction. All this applies to formal and to informal explicatum languages though informal languages often do not explicitly determine whether certain forms of introduction are available and what relevant background theories are. Explicators should keep this in mind while preparing the explication (2.b). There are at least three typical forms of introduction that are usually accepted in both formal and informal settings:

(i) Definition: Here, definitions are understood as constituting a form of introduction that is performed within (explicatum) languages. Thus, definitions are not metalinguistic. In formal languages, defining is like inferring in that it is governed by rules that can be stated in a suitable metalanguage. A considerable number of definition rules have been developed for formal languages (Suppes 1957, ch. 8). The definition of predicates by generalized biconditionals and the definition of individual constants and function constants by (generalized) identities are quite common. An example of the latter has been provided above for the ordered pair function constant ‘(.., ..)’ (1.b). It requires that in the explicatum language, or theory, the set theoretic symbols and some logical constants are already introduced. An example of a definition of predicates applied to an informal language is represented thus: any b is a piece of knowledge if and only if b is a belief and b is true and b is justified. Setting this proposition as a definition presupposes that the predicates have a definite meaning relative to the explicatum language, either by introduction or by implicit convention. Thus, when explicating the explicandum ‘knowledge’ by the explicatum ‘is a piece of knowledge’ through the definition just provided, explicators should settle on a language that (I) allows for this kind of definition and (II) provides the expressions occurring in the definiens, including their meaning.

(ii) Meaning postulate/axiom: Setting axioms or, by another terminology, meaning postulates is performed within languages as well. This excludes axiom schemes which are metalinguistic devices for describing classes of object language axioms (see (iii), below). Again, being an object language form of introduction, axioms can be seen as governed by metalinguistic rules. These rules usually call for consistency, or prima facie consistency, so that the resulting language/theory or parts of language/theory are not rendered trivializable or even inconsistent. Traditionally, axioms are thought to be true or evident, though it is not always clear according to what measures they could be judged thus. Alternatively, and instrumentally, setting axioms can itself be seen as an act of qualifying propositions as true. At any rate, while definitions have a specific syntactic form (see above) which guarantees non-creativity and eliminability, axioms are less regulated syntactically and usually allow for creativity in the sense of establishing new truths in a language (or theory). Axioms come into play when expressions cannot be defined but are seen as basic concepts which nonetheless need regulation. The various axiomatizations of set theory can be seen as explications of ‘belongs to’ or ‘is an element in’. In such an explication, the explicative introduction consists of several acts with each setting one axiom. An example of an informal explication employing axiomatic introduction is the juridical explication of ‘person’ with regard to corporations that states that in a number of respects, corporations have personhood. (‘.. has personhood’ can be seen as the explicatum.)

(iii) Metalinguistic rule: Axioms are used to introduce the basic expressions of a language. However, this is only true if logical and auxiliary expressions are disregarded. Frequently, these are not regulated by axioms but by metalinguistic rules, including rules for setting axioms and definitions as well as rules of inference and assumption. This illustrates that metalinguistic rules are yet another means of giving meaning to an expression and can be used for acts of explicative introduction. Therefore, establishing the rules of hypothetical derivation (conditional introduction) and modus ponendo ponens (conditional elimination) in a language may be seen as an act of explicative introduction which explicates the expressions ‘if … then …’ and ‘… provided …’ of ordinary language. Explication by metalinguistic rules also makes sense if the means of the object language are not sufficient to endow the explicatum with the full intended meaning. In mineralogy, hardness (explicatum) can be regulated by an operational rule that gives practical instructions to be followed; depending on their outcome, propositions on comparative hardness may be asserted. Metalinguistic rules, like axioms, are rather flexible. Usually, the only (meta-meta-)rule to be followed is that the metalinguistic rules shall not lead to any kind of prescriptive dilemma with respect to what moves they allow or forbid in their corresponding object language (but also see Prior 1960 for tonk style rules to be avoided).

The forms of introduction described here are not exhaustive. For example, forms of mixed introduction that combine metalinguistic rules and object language axioms are easily conceived. When explicating ‘heavier’ for a physical theory, the setting of axioms of irreflexivity and transitivity can be combined with an operational rule directing the use of a beam scale that leads to atomic propositions about one body being heavier than another (see Siegwart 2007b, 52-56).
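The three forms of introduction can be set side by side in symbols. The following is only an illustrative sketch: the predicate letters K (‘is a piece of knowledge’), B (‘is a belief’), T (‘is true’), J (‘is justified’) and H (‘is heavier than’), as well as the rule labels, are assumptions made here for the sake of the example and do not stem from the literature cited above.

```latex
% (i) Definition by a generalized biconditional (the knowledge example)
\[
\forall b\,\bigl(K(b) \leftrightarrow B(b) \land T(b) \land J(b)\bigr)
\]

% (ii) Axioms: irreflexivity and transitivity of `heavier'
\[
\forall x\,\neg H(x,x)
\qquad
\forall x\,\forall y\,\forall z\,\bigl(H(x,y) \land H(y,z) \rightarrow H(x,z)\bigr)
\]

% (iii) Metalinguistic rules: conditional introduction and conditional elimination
\[
\frac{\begin{array}{c}[A]\\ \vdots\\ B\end{array}}{A \rightarrow B}\ (\rightarrow\mathrm{I})
\qquad
\frac{A \rightarrow B \qquad A}{B}\ (\rightarrow\mathrm{E})
\]
```

The first two items are propositions of the object language, whereas the third item states rules about which object language moves are permitted. The operational beam scale rule for ‘heavier’ has no counterpart in this sketch because it directs a practical procedure rather than a derivation.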

d. Assessing Adequacy

After the explicatum has been introduced, the explication is commonly assessed. This can be done by verifying the criteria of explicative adequacy which have been established in the last step of the explicative preparation (2.b). A straightforward test of adequacy consists in verifying all these criteria in the explicatum language with the help of the explicative introductions performed in the preceding step (2.c) and, possibly, with the help of an explicatum theory provided beforehand (Cordes 2017). A positive test result qualifies the explication as adequate or successful. If any of the criteria is disproven in the test of adequacy, the explication is inadequate or has failed. A full test of adequacy is not always feasible. Sometimes, some criteria of adequacy (for example, metalinguistic criteria) cannot be decided by the resources available or are even undecidable. If at least all object language criteria of adequacy are proven in the test of adequacy, one may speak of a consequentially adequate explication. In any case, if there are criteria of adequacy that have neither been proven nor disproven and if all other criteria have been proven, the explication is adequate on probation (Cordes 2016, 38).
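The verdicts just distinguished can be read as a small decision procedure. The following Python sketch is one possible rendering and not part of the cited literature: the names `Status`, `Criterion` and `assess` are hypothetical, the statuses assigned to the two ‘planet’ criteria are invented for illustration, and the sketch assumes that a single disproven criterion overrides every other label.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PROVEN = "proven"
    DISPROVEN = "disproven"
    OPEN = "open"  # neither proven nor disproven so far


@dataclass(frozen=True)
class Criterion:
    text: str
    metalinguistic: bool
    status: Status


def assess(criteria):
    """Return the set of adequacy labels that apply to a list of criteria."""
    # Assumption: one disproven criterion makes the explication inadequate,
    # regardless of how the remaining criteria fare.
    if any(c.status is Status.DISPROVEN for c in criteria):
        return {"inadequate"}
    labels = set()
    if all(c.status is Status.PROVEN for c in criteria):
        labels.add("adequate")
    object_level = [c for c in criteria if not c.metalinguistic]
    if all(c.status is Status.PROVEN for c in object_level):
        labels.add("consequentially adequate")
    if any(c.status is Status.OPEN for c in criteria):
        # Nothing is disproven at this point, so all non-open criteria are proven.
        labels.add("adequate on probation")
    return labels


# Hypothetical test outcome for the 'planet' example from 2.b, step (xi).
planet = [
    Criterion("planets are celestial bodies", False, Status.PROVEN),
    Criterion("most planetary laws discovered before 1990 still hold", True, Status.OPEN),
]
print(sorted(assess(planet)))
# prints ['adequate on probation', 'consequentially adequate']
```

The function returns a set because the labels are not mutually exclusive: an explication whose criteria are all proven is adequate and, trivially, consequentially adequate as well.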

As the criteria of adequacy constitute the salient benchmark for explications, a bad performance in tests of adequacy should motivate revisions. Accordingly, it is common for explicators to go back and forth and tweak any of the constituents of the explication. Rarely is the explicandum or the explicandum language changed in order to get the desired positive test results. Changing the explicatum, the explicatum language (or theory), the criteria of explicative adequacy, or the explicative introduction may all influence the test. Note that the to and fro of the revision process is usually omitted when an explication is presented within a publication in order to present only the final explication.

Besides the internal test of adequacy, external measures of explicative or introductive quality are conceivable. (As explicative adequacy is an internal quality measure, confusion is avoided by not attributing adequacy to explications in an external sense. Still, Stein (1992, 280) gives an externalistic view on explicative adequacy.) Carnap’s requirements of exactness, fruitfulness, and simplicity are generic, and thus external, criteria of introductive quality, which can be specified in various ways (cf. 1.b). The choice of any of the following may cause external criticism despite a positive result of the internal test of adequacy: explicandum, explicandum language (and theory), explicatum, explicatum language (and theory), criteria of adequacy, explicative introduction.

One easy way of obtaining a successful explication is to employ the criteria of explicative adequacy as the explicative introduction, often as axioms/meaning postulates. This, however, trivializes the verification of the criteria. In some cases that is acceptable, but often it can be criticized on grounds of lacking fruitfulness because nothing that exceeds the criteria of adequacy will follow and a non-creative definition securing eliminability may be preferable over the ad hoc postulation of the criteria.

External assessments of explications include comparisons to other explications. Different explications may vie with one another for various desiderata, like simplicity, conciseness of the criteria of adequacy, or syntactical parsimony. Comparative assessments of explications substantiate explicative debates (3.a). A thorough exegesis of the history of explications for a given explicandum in the preparatory steps (2.b, (viii)) is an important resource to rely on in these debates.

3. Explication Theory

The second section provided an outline of an explicative procedure. It can be consulted when performing explications. When reflecting on explications, a theory of explication is needed. Such a theory can be provided with varying degrees of precision and distinguishing potential. In what follows (3.a), a theory is developed that falls short of the formal rigor of a set theory in which languages, expressions, and explications are set theoretic entities (Cordes 2017). The aim is to provide some intuitive terminology in order to distinguish different kinds of explications and to relate explications to one another. Reflections on explications in this sense are often important when explicative disputes emerge and when the whole method of explication is discussed. A second subsection can be seen as an application of the terminology and also as a treatment of the analysis paradox that is frequently revisited when the method of explication is discussed critically.

a. Constituents of Explication

As described above, explications can be conceived as made up out of six constituents: explicandum, explicandum language, explicatum, explicatum language, criteria of explicative adequacy, and the explicative introduction. If so conceived, providing an explication means specifying these six constituents (2.b, 2.c). Reflecting on an explication means describing and theorizing on any of these constituents, their interrelations, or their relations to other entities—other explications, explicators, and the speakers of explicandum and explicatum language, background theories, etc. General discussions and justifications of the method of explication are supported by its theory, which can take the form of, for example, the six-constituent conception presupposed here. Other conceptions can be provided as well, but such activities have not received much attention in the past.

(i) On occasion, explicators or recipients of explications may reflect on one or several constituents of an explication separately. Explicators do so, for example, when following a procedure of explication akin to the one elaborated in section 2. In a strict sense, explicators cease to do so as soon as they relate the different constituents to one another, for example by considering the suitability of a specific explicatum language with regard to the potential explicata it accommodates. Nonetheless, a lot of explicators’ work is indeed directed at single constituents, like individuating the explicandum language. The same holds for recipients of explications who may take issue with one individual constituent for reasons purely internal to that constituent. For example, with regard to a certain criterion of adequacy, recipients may want to criticize it as self-contradictory or they may want to voice concerns about its intelligibility quite apart from how it relates to the other aspects of the explication.

(ii) The reflection on relations between different constituents within one explication has one prime example within the procedure of explication, namely the adequacy test. It relates the explicative introduction to the explicatum language and to the criteria of explicative adequacy. Thus, qualifying an explication as adequate, inadequate, consequentially adequate, or adequate on probation (2.d) is a matter of reflecting on some relations between the components of an explication. There are other kinds of relations that belong in this category. For example, explicators may want to compare the explicandum language and the explicatum language with respect to which is stronger or more precise or whether they are sublanguages of one another. This suggests the distinction between intra-language explications, where explicandum language and explicatum language are the same, and inter-language explications, where they are not.

(iii) Transcending isolated explications, a theory of explications provides means to compare different explications to one another. Of special interest are explications where the explicative introductions are equivalent (convergent explications) and which either start with the same explicandum from the same explicandum language (explication alternatives, as opposed to genetically different explications) or yield the same explicatum from the same explicatum language.  It is then possible to distinguish linguistic alternatives (different explicatum languages), lexical alternatives (different explicata), criterial alternatives (different criteria of adequacy) and introductory alternatives (different explicative introductions) among explication alternatives. This systematic approach to explication alternatives provides a starting point to carve explicative disputes at their joints. It allows explicators to localize their dissent and develop a productive explicative debate.

Apart from explication alternatives, other phenomena are noteworthy: If an explication starts from the results of a previous explication, then one may speak of a consecutive explication. As soon as more than two explications are related to one another, more interesting structures emerge. A history of explication is just any number of explications that are mutual explication alternatives. In contrast, chains of explications are those sequences of explications where the earlier explications provide explicata that are used in the later explications to introduce the later explicata (see sect. 4). Usually, the earlier explicata just figure in the later explicata’s definientia. For more detailed distinctions see Cordes (2017, sect. 4).

(iv) Lastly, one can reflect on relations between explications on one hand and entities that are independent from explications on the other hand. For example, when several explications provide explicata in a common explicatum language, the result can be considered a theory. This theory can be analyzed in many respects that are different from its explicative genesis. Also, as pointed out in the latter part of 2.d, explications can be externally assessed. Such an assessment accounts for the three Carnapian requirements of exactness, fruitfulness, and simplicity. But in general, explicators and the recipients of explications are free to relate explications to whatever entities they see fit.

This final group includes any investigations into the social dimension of putting forward an explication and into how explicators interact either competitively or cooperatively with one another or with other agents. Under the six-constituent conception of explication, explicators are not part of their explication. Consequentially and illustratively, when a written text is, with substantial exegetic effort, interpreted as an explication, then the six-constituent conception of explication says nothing about whether the distinctness of writer and interpreter or their double effort justifies speaking of two explications. An analogous case occurs when authors develop a predecessor’s systematic-explicative ideas on some subject. These two scenarios suggest that in the vicinity of explications, the possible interactions between agents are versatile. Literature on this social-conceptual dimension of explication is rare.
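The six-constituent conception and the classification of explication alternatives from (iii) lend themselves to a simple record structure. The sketch below is a deliberately naive illustration rather than the set theoretic treatment in Cordes (2017): the class `Explication`, the function `alternative_kinds`, and the sample entries (including the wording of the criterion and the four-condition variant) are assumptions made here, and comparing constituents by string equality is a crude stand-in for the identity of expressions and languages.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Explication:
    explicandum: str
    explicandum_language: str
    explicatum: str
    explicatum_language: str
    criteria: frozenset
    introduction: str


def alternative_kinds(a, b):
    """Classify two explications; None means they are genetically different."""
    same_origin = (a.explicandum == b.explicandum
                   and a.explicandum_language == b.explicandum_language)
    if not same_origin:
        return None
    kinds = set()
    if a.explicatum_language != b.explicatum_language:
        kinds.add("linguistic")
    if a.explicatum != b.explicatum:
        kinds.add("lexical")
    if a.criteria != b.criteria:
        kinds.add("criterial")
    if a.introduction != b.introduction:
        kinds.add("introductory")
    return kinds


# Hypothetical sample data, loosely based on the knowledge example in 2.c.
jtb = Explication(
    explicandum="knowledge",
    explicandum_language="contemporary philosopher's English",
    explicatum="is a piece of knowledge",
    explicatum_language="regimented English",
    criteria=frozenset({"every piece of knowledge is a belief"}),
    introduction="b is a piece of knowledge iff b is a belief, true, and justified",
)
# Same origin and explicatum, but a different (invented) explicative introduction.
variant = replace(jtb, introduction="b is a piece of knowledge iff b is a belief, "
                                    "true, justified, and meets a fourth condition")

print(alternative_kinds(jtb, variant))
# prints {'introductory'}
```

Localizing a disagreement then amounts to reading off which of the four labels apply, which is the sense in which the terminology helps to carve explicative disputes at their joints.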

b. The Analysis Paradox

The analysis paradox is a prima facie cognitive dilemma that repeatedly surfaces in, among other areas, explication theory (Stein 1992, 280; Dutilh Novaes and Reck 2017)—sometimes without explicit mention (Justus 2012). Some scholars even view the paradox of analysis as a main motivator for the development of a method of explication by Carnap (Lavers 2013, 226) or by Frege (Beaney 2004, 120; see sect. 4 below). This is a debatable view, since neither philosopher referred to the resolution of the paradox as a goal of their respective efforts. At any rate, setting up and employing any method of explication can be clearly separated from theorizing about how it relates to the paradox of analysis.

The term ‘paradox of analysis’ was first used by Langford in reference to G. E. Moore (Langford 1942, 323; see Beaney 2014, who also applies the term ‘paradox of analysis’ to some earlier problems). There are several versions of the paradox, but the one that is relevant to explication can be put as the following dilemma: Either (i) an explicator fails because the result of the explication diverges from the explicandum, thus not explicating the intended concept, or (ii) an explicator fails because the explicatum concept is identical with the explicandum, thus not improving on the pre-explicative cognitive situation.

The paradox is often associated with an underlying conflict about the role of natural and artificial (not only formal) languages in philosophy. For explications outside philosophy, this specific issue is less pertinent. Thus, Strawson (1963, 515) characterizes philosophical problems as problems about concepts, with those concepts being conveyed by natural language. Carnapian explication includes an explicatum concept (often conveyed by artificial language) that is allowed to diverge from the explicandum concept. This divergence prompts Strawson to his widely recognized charge that an explication “change[s] the subject” (1963, 506). Strawson’s critique could be framed so that it is directed against Carnap’s perceived acceptance of the first horn of the dilemma, while, at that point, Strawson takes the second horn. Consequently, this causes him to articulate his methodology in a way that diverts the appearance of failure connected to that horn.

The criticism of explication as changing the subject is thus, on the one hand, an application of the paradox of analysis to explication and, on the other hand, a manifestation of a deeper methodological dispute about the subject of philosophical problems and the language in which they are framed. Facing this situation within philosophy, explication theorists have two ways to deal with it: (i) Accept Carnapian explication and accept a central role of artificial languages, or (ii) put the emphasis in philosophy on natural language and reformulate the method of explication in divergence from Carnap.

(i) For the first way to deal with the paradox, explication theory provides a clear and intuitive understanding of explicative failure (2.d), which is different from the (negligible) failure of not exactly capturing the unaltered pre-explicative use of a term. Thus, in this approach, the first horn of the dilemma is not seen as constituting failure. In a sense, the “changing of the subject” is recognized and may in fact be encouraged. If, for example, there are inconsistencies in the explicandum language which are seen as being caused by the meaning of the explicandum, then the explicatum should diverge from the original meaning. Accordingly, when characterizing explication, Löffler (2008, 14) talks about remedy (“entstören”) instead of replacement (“ersetzen”). This is a departure from Carnap and many other scholars who portray the explicandum as being replaced by the explicatum. For another line of reasoning in partial support of conceptual change within explication (ameliorative analysis), see Haslanger (2012, 392-394).

(ii) The second way to deal with the dilemma chooses natural over artificial language. However, this alone does not fend off the complaint of “changing the subject”. Even within natural language, the sameness or the difference of explicandum and explicatum may be seen as a dilemma. Thus, even intra-ordinary-language explications are prone to the paradox of analysis if unchanging replacement is expected.

4. An Exemplary Explication

In retrospect, the history of philosophy provides scholars of explication with numerous examples, although the exact count depends on the concept of explication. Presupposing a weak understanding, nearly any conceptual clarification in a philosophical context may count as an explication. Olsson (2015), for example, considers the JTB conception of knowledge to be an example of an explication, which is the basis for his defense against objections along Gettier’s lines. He does not cite a specific text which counts as the locus explicationis (a text documenting the execution of the explication). Instead, he trusts that the genesis of the JTB conception conforms to his 2-step conception of explication (2015, 59). In this last section, Frege’s Foundations of Arithmetic (1953) is described through the lens of an explication theorist. That is, the book is taken as an explication exemplifying the methodology from sect. 2 and some of the terminology from sect. 3. (For a deeper investigation into Frege’s project from a different point of view, see Blanchette 2012, especially ch. 4.)

Although Frege’s book The Foundations of Arithmetic is usually seen as an explication of ‘number’ (Lavers 2013, 225), he notably starts his investigation by posing an explicative question with regard to ‘one’ or ‘1’. Frege defines ‘one’ in the latter part of the book (§77) after defining ‘number’. Carnap (1963, 935), however, refers to the work as an explication of the numerical words ‘one’, ‘two’, etc. The book can be considered to contain not only one explication, but a chain of explications (3.a) with the explication of ‘number’ at the center. Roughly, part II of the book (§§18-28) predominantly deals with ‘number’ and part III (§§29-54) with ‘one’. The very first sentence of the book (Introduction, p. I) contains two versions of the explicative question regarding ‘1’: ‘What is the number one? What does the symbol 1 mean?’ The second version emphasizes Frege’s conceptual or explicative intention when posing the question. Subsequently, he clarifies that he is asking for a definition of that numeral. The explicative question with respect to ‘number’ is posed a few paragraphs later: What is number? (Introduction, p. II)

Frege is aware that he carries the burden of proof regarding the need for an explication (p. III, and explicitly on p. V: “a desire for a stricter enquiry”) and so points out some problems with the talk of numbers in the course of the introduction. Ambiguity (p. I) and contradiction (p. IV) are two of his concerns. Frege is aware of other conceptual proposals (such as a history of explications) and that the proposals compete with one another (p. V).

The explicanda are named numerous times throughout the book, for example in the explicative questions: ‘1’ and ‘number’. After the introduction, Frege focuses on ‘number’ to illustrate his general aims, which is why this explicandum is more prominent (§§2, 4). The explicandum language can either be taken to be the everyday talk about whole-number quantitative issues or even the scientific mathematical jargon of Frege’s time. His reference to “arithmetic” seems to suggest the latter, but several of his examples stem from everyday life.

As mentioned above, ambiguity is a reason for Frege to start the explicative enterprise. When associating ‘one’ or the article ‘a’ with the partial synonym ‘unit’ or ‘unity’, he struggles with its ambiguity for several sections of the book (§§29-39). Right from the beginning of dealing with ‘unit’, Frege is skeptical of that term and rejects it as a possible explicandum. The syntactic ambiguity of ‘ein’ in the German language plays a role as well. (It can be a numeral or an indefinite article.) With regard to the ambiguity of ‘number’, Frege restricts his investigation to non-negative integers in §2, excluding other types of numbers.

Frege does not review systematic empirical research on the use of the expressions ‘one’ or ‘number’, but he gives several examples from ordinary language which he regards as representing widely shared use patterns (e.g. §52). (This is common in philosophical explications, since detailed linguistic studies for the relevant expressions are rarely available.) Frege also refrains from identifying a definite body of texts including relevant uses of the explicanda, although part II of the book is labeled “Views of certain writers on the concept of Number” which can be understood as an incidental definition of a canon of reference texts. In any case, it seems plausible to take Frege’s consideration of some proposals by other authors as an enquiry into preceding explications. Besides his repeated criticism, he positively incorporates a few aspects from earlier explications, for example Leibniz’s definition of the number two and beyond (§§6, 55). Strictly speaking, this does not concern the explicanda identified (‘number’, ‘one’) but related expressions (‘two’, ‘three’ …). When discussing the definability of ‘number’ in general, Frege recognizes the failures of some earlier attempts (§20). This can be seen as another recognition of the explicative history.

Frege was writing before the dawn of metamathematics, and thus of the notion of a metalanguage, while also taking leave from his own formal system (Begriffsschrift); it is therefore understandable that there is no section in his book where an explicatum language is explicitly named. It seems plausible to understand Frege’s explication of ‘number’ as not necessarily intended for everyday use, but for mathematical and philosophical contexts. Therefore, the explicatum language may be the scientific mathematical jargon. (In retrospect, one may understand Frege to be employing the language of a formal set theory as the explicatum language, but this defies chronology.) Thus, the scientific mathematical jargon is Frege’s explicandum language and explicatum language. This yields an intra-language explication (3.a). When considered this way, it is to be expected as per Carnap that the explicatum is situated within “a more exact part” of the explicandum language (Carnap 1963, 935). This, of course, depends on the further steps. For example, Frege’s investigation ensures that grammatical categorization is more transparent on the explicatum end. It becomes clear that ‘one’ is not supposed to be taken as a property name for objects (§§29-33). Some semantical issues (“Are units identical with one another?”, §34) are also addressed explicitly. In sum, instead of providing an explicit language and a definition of ‘number’ and ‘one’, Frege creates a “more exact part” within ordinary mathematical (or philosophical) language by answering several questions that pertain to the syntactical and semantical properties of the explicandum. Some properties of the environment in which the explicatum is to be situated are provided by Frege in passing: “I assume that it is known what the extension of a concept is.” (§68, fn.)

Since this process gradually transforms the original (explicandum) setting into the setting for the explicative introduction, the explicatum is not named explicitly, except in the definitions that are viewed as the explicative introductions (§§72, 77). The explicata are ‘… is a Number’ and ‘1’, a unary predicate constant and an individual constant in modern parlance. To avoid the ambiguous picture in which not only explicandum language and explicatum language are identical, but explicandum and explicatum as well, one may read the explicata with a suppressed index: ‘1F’.

Several definitions that are strictly part of the individuation of the explicatum language (or theory) lead up to and connect the two specific acts of explicative introduction. They are both acts of definition: “n is a Number” is to mean the same as the expression “there exists a concept such that n is the Number which belongs to it” (§72). “1 is the Number which belongs to the concept ‘identical with 0’.” (§77) The metalinguistic character of these definitions can be discounted, as both can easily be rephrased without referring to expressions. Frege plausibly employed the quotation marks as a parsing device. The first explicative introduction introduces ‘… is a Number’ by employing the unary function constant ‘the number of …’. The second explicative introduction, which introduces ‘1’, also relies on the expression ‘the number of …’ and on ‘0’. ‘0’ is introduced in §74. The definition of ‘the number of …’ relies on ‘the extension of …’ (a base expression) and the predicate ‘… is equal to …’ (“gleichzahlig”), which is introduced in §72 as well. In the preceding illustrations, a segment of Frege’s definitional tree is sketched. If ‘the number of …’ and ‘0’ were included among the explicata, then at least two chains of explications (3.a) could be seen here: A1. ‘the number of …’ along with A2. ‘… is a number’ comprises one chain; and the other chain is comprised of B1. ‘the number of …’, B2. ‘0’, and B3. ‘1’. Following the explicative introductions, Frege defines further expressions, like ‘… follows in the series of natural numbers directly after …’ (§76).
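
The segment of the definitional tree discussed here can be summarized schematically in modern notation. This is only a sketch and not Frege’s own symbolism: the symbols N (for ‘… is a Number’), # (for ‘the number of …’), ext (for ‘the extension of …’) and ≈ (for ‘… is equal to …’, “gleichzahlig”) are assumed for illustration, and the clauses for # and for 0 follow the standard reading of §68 and §74 rather than wording quoted in this article.

```latex
\begin{align*}
F \approx G &\;:\Longleftrightarrow\; \text{the } F\text{s and the } G\text{s can be correlated one-to-one} && (\S 72)\\
\#F &\;:=\; \operatorname{ext}\bigl(\text{the concept `equal to the concept } F\text{'}\bigr) && (\S\S 68,\ 72)\\
N(n) &\;:\Longleftrightarrow\; \exists F\,\bigl(n = \#F\bigr) && (\S 72)\\
0 &\;:=\; \#[x : x \neq x] && (\S 74)\\
1 &\;:=\; \#[x : x = 0] && (\S 77)
\end{align*}
```

Read from top to bottom, each line uses only expressions introduced above it or the base expression ‘the extension of …’, which is what makes the two chains (‘the number of …’ to ‘… is a Number’, and ‘the number of …’ via ‘0’ to ‘1’) visible.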

Even before the number definition in §72, Frege launches efforts to assess the adequacy of his explication by preparing the verification of criteria of adequacy. He does this consciously, as can be seen from the headline, “Our definition completed and its worth proved”, and the subsequent section which mentions what was to become Carnap’s generic criterion of fruitfulness (§70). The specific criteria of explicative adequacy are not explicitly formulated before the explicative introduction. In fact, they are not presented until he actually verifies some of them, leaving the others without proof. This order of things is methodologically not ideal because it gives the criteria an ad-hoc character. However, explicators proceed in this fashion frequently, possibly for dramaturgical reasons. Among the specific criteria of adequacy are some theorems about equality which appear in the definiens for ‘1’. Frege comes closest to giving criteria of adequacy in §78 when he lists six theorems employing several of the expressions he defines and not only the two explicata. His motions toward full arguments supporting these theorems feature several more definitions and further theorems. Thus, Frege’s method of assessing the adequacy of his explications is not in full accord with the method of explication outlined in sect. 2. It would be better described as an argument for adequacy by use, that is, by using the explicata in the way they are usually used in arithmetic. With regard to certain theorems, Frege presupposes their truth by qualifying them as analytic in the concluding sections (§§87, 109).

5. References and Further Reading

  • Aristotle, 1984, The Complete Works of Aristotle, J. Barnes (ed.), Princeton: Princeton University Press.
  • Audi, P., 2015, “Explanation and Explication,” in The Palgrave Handbook of Philosophical Methods, C. Daly (ed.), Houndsmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 208-230.
  • Beaney, M., 2004, “Carnap’s Conception of Explication: From Frege to Husserl?,” in Carnap Brought Home: The View from Jena, S. Awodey, and C. Klein (eds.), Chicago and LaSalle, IL: Open Court, pp. 117-150.
  • Beaney, M., 2014, “Analysis,” in Stanford Encyclopedia of Philosophy, E. N. Zalta (ed.), Stanford: Stanford University. URL: https://plato.stanford.edu/entries/analysis/
  • Blanchette, P. A., 2012, Frege’s Conception of Logic, Oxford: OUP.
  • Boniolo, G., 2003, “Kant’s Explication and Carnap’s Explication: The Redde Rationem,” International Philosophical Quarterly, 43 (1): 289-298.
  • Brun, G., 2016, “Explication as a Method of Conceptual Re-engineering,” Erkenntnis, 81 (6): 1211-1241.
  • Brun, G., 2017, “Conceptual re-engineering: from explication to reflective equilibrium,” Synthese. DOI: https://doi.org/10.1007/s11229-017-1596-4
  • Carnap, R., 1947, Meaning and Necessity: A Study in Semantics and Modal Logic, Chicago and London: The University of Chicago Press.
  • Carnap, R., 1950, Logical Foundations of Probability, Chicago: The University of Chicago Press.
  • Carnap, R., 1963, “Replies and Systematic Expositions,” in The Philosophy of Rudolf Carnap, P. A. Schilpp (ed.), LaSalle, IL: Open Court, pp. 859-1013.
  • Carnap, R., 2003 [1928], The Logical Structure of the World and Pseudoproblems in Philosophy, R. A. George (transl.), Chicago, Ill.: Open Court.
  • Carus, A. W., 2007, Carnap and Twentieth-Century Thought: Explication as Enlightenment, Cambridge et al.: CUP.
  • Carus, A. W., 2012, “Engineers and Drifters: The Ideal of Explication and Its Critics,” in Carnap’s Ideal of Explication and Naturalism, P. Wagner (ed.), Houndsmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 225-239.
  • Cohnitz, D., and M. Rossberg, 2006, Nelson Goodman, Chesham: Acumen.
  • Cordes, M., 2016, Scheinprobleme. Ein explikativer Versuch, dissertation, University of Greifswald. URL: https://nbn-resolving.org/urn:nbn:de:gbv:9-002497-0.
  • Cordes, M., 2017, “The constituents of an explication,” Synthese. DOI: https://doi.org/10.1007/s11229-017-1615-5.
  • Creath, R., 2012, “Before Explication,” in Carnap’s Ideal of Explication and Naturalism, P. Wagner (ed.), Houndsmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 161-174.
  • Dutilh Novaes, C., and E. Reck, 2017, “Carnapian explication, formalisms as cognitive tools and the paradox of adequate formalization,” Synthese 194 (1): 195-215.
  • Floyd, J., 2012, “Wittgenstein, Carnap, and Turing: Contrasting Notions of Analysis,” in Carnap’s Ideal of Explication and Naturalism, P. Wagner (ed.), Houndsmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 34-46.
  • Frege, G., 1953 [1884], Die Grundlagen der Arithmetik – Eine logisch mathematische Untersuchung über den Begriff der Zahl/The Foundations of Arithmetic – A logico-mathematical enquiry into the concept of number, J. L. Austin (transl.), Oxford: Blackwell.
  • Frege, G., 1969 [1914], “Logik in der Mathematik,” in Gottlob Frege: Nachgelassene Schriften, H. Hermes, F. Kambartel, and F. Kaulbach (eds.), Hamburg: Meiner, pp. 218-270. (English translation: Frege, G., 1979, “Logic in Mathematics,” in Gottlob Frege: Posthumous Writings, H. Hermes, F. Kambartel, and F. Kaulbach (eds.), P. Long, and R. White (transl.), Oxford: Blackwell, pp. 203-250.)
  • Goodman, N., 1966 [1951], The Structure of Appearance, Indianapolis/New York/Kansas City: Bobbs-Merrill.
  • Greimann, D., 2007, “Regeln für das korrekte Explizieren von Begriffen,” Zeitschrift für philosophische Forschung, 61 (3): 261-282.
  • Hahn, S., 2013, Rationalität – eine Kartierung, Münster: Mentis. (English edition in preparation.)
  • Hanna, J. F., 1968, “An Explication of ‘Explication’,” Philosophy of Science, 35 (1): 28-44.
  • Haslanger, S., 2012, Resisting Reality: Social Construction and Social Critique, Oxford: OUP.
  • Hempel, C. G., 1952, Fundamentals of Concept Formation in Empirical Science, Vol. II, No. 7 of International Encyclopedia of Unified Science, O. Neurath, R. Carnap, and C. Morris (eds.), Chicago and London: The University of Chicago Press.
  • Hodges, W., 2014, “Tarski’s Truth Definitions,” in Stanford Encyclopedia of Philosophy, E. N. Zalta (ed.), Stanford: Stanford University. URL: https://plato.stanford.edu/entries/tarski-truth/
  • IAU (International Astronomical Union), 2006, “IAU 2006 General Assembly: Result of the IAU Resolution votes,” Prague. URL: https://www.iau.org/news/pressreleases/detail/iau0603/
  • Ibarra, A., and T. Mormann, 1992, “L’explication en tant que généralisation théorique,” Dialectica, 46 (2): 151-168.
  • Justus, J., 2012, “Carnap on concept determination: methodology for philosophy of science,” European Journal for Philosophy of Science, 2: 161-179.
  • Kant, I., 1998 [1781], Critique of Pure Reason, P. Guyer, and A. W. Wood (transl./eds.), Cambridge et al.: CUP.
  • Lambert, J. H., 1771, Anlage zur Architectonic oder Theorie des Einfachen und des Ersten in der philosophischen und mathematischen Erkenntniß, Riga: Hartknoch.
  • Langford, C. H., 1942, “The Notion of Analysis in Moore’s Philosophy,” in The Philosophy of G. E. Moore, P. A. Schilpp (ed.), Evanston/Chicago: Northwestern University, pp. 319-342.
  • Lavers, G., 2013, “Frege, Carnap, and Explication: ‘Our Concern Here Is to Arrive at a Concept of Number Usable for the Purpose of Science’,” History and Philosophy of Logic, 34 (3): 225-241.
  • Locke, J., 1997 [1689], An Essay Concerning Human Understanding, London et al.: Penguin.
  • Löffler, W., 2008, Einführung in die Logik, Stuttgart: W. Kohlhammer.
  • Maher, P., 2007, “Explication Defended,” Studia Logica, 86: 331-341.
  • Martin, M., 1973, “The Explication of a Theory,” Philosophia, 3 (2-3): 179-199.
  • Menger, K., 1943, “What is Dimension?” The American Mathematical Monthly, 50: 2-7.
  • Murzi, M., 2007, “Changes in a scientific concept: what is a planet?” PhilSci Archive. URL: http://philsci-archive.pitt.edu/id/eprint/3418
  • Naess, A., 1953, Interpretation and Preciseness: A Contribution to the Theory of Communication, Oslo: Dybwad.
  • Olsson, E. J., 2015, “Gettier and the method of explication: a 60 year old solution to a 50 year old problem,” Philosophical Studies, 172 (1): 57-72.
  • Pehar, D., 2001, “Use of Ambiguities in Peace Agreements,” in Language and Diplomacy, J. Kurbalija, and H. Slavik (eds.), Malta: DiploProjects, pp. 163-200.
  • Pinder, M., 2017a, “Does Experimental Philosophy Have a Role to Play in Carnapian Explication?” Ratio, 30 (4): 443-461.
  • Pinder, M., 2017b, “On Strawson’s critique of explication as a method in philosophy,” Synthese. DOI: https://doi.org/10.1007/s11229-017-1614-6
  • Prior, A. N., 1960, “The Runabout Inference-Ticket,” Analysis, 21 (2): 38-39.
  • Quine, W. V. O., 1951, “Two Dogmas of Empiricism,” The Philosophical Review, 60 (1): 20-43.
  • Quine, W. V. O., 1960, Word and Object, New York/London: The Technology Press (MIT)/John Wiley & Sons.
  • Radnitzky, G., 1989, “Explikation,” in Handlexikon zur Wissenschaftstheorie, H. Seiffert, and G. Radnitzky (eds.), München: Ehrenwirth, pp. 73-80.
  • Reck, E., 2012, “Carnapian Explication: A Case Study and Critique,” in Carnap’s Ideal of Explication and Naturalism, P. Wagner (ed.), Houndsmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 96-116.
  • Reichenbach, H., 1951, “The Verifiability Theory of Meaning,” Proceedings of the American Academy of Arts and Sciences, 80 (1): 46-60.
  • Schupbach, J. N., 2017, “Experimental Explication,” Philosophy and Phenomenological Research, 94 (3): 672-710.
  • Shepherd, J., and J. Justus, 2015, “X-Phi and Carnapian Explication,” Erkenntnis, 80 (2): 381-402.
  • Siegwart, G., 1997a, “Explikation. Ein methodologischer Versuch,” in Dialog und System: Otto Muck zum 65. Geburtstag, W. Löffler, and E. Runggaldier (eds.), Sankt Augustin: Academia, pp. 15-45.
  • Siegwart, G., 1997b, Vorfragen zur Wahrheit: Ein Traktat über kognitive Sprachen, München: Oldenbourg.
  • Siegwart, G., 2007a, “Johann Heinrich Lambert und die präexplikativen Methoden,” Philosophisches Jahrbuch, 114 (1): 95-116.
  • Siegwart, G., 2007b, “Alethic acts and alethiological reflection. An outline of a constructive philosophy of truth,” in Truth and Speech Acts. Studies in the philosophy of language, D. Greimann, and G. Siegwart (eds.), New York: Routledge, pp. 41-58.
  • Stein, H., 1992, “Was Carnap Entirely Wrong, after All?” Synthese, 93 (1/2): 275-295.
  • Strawson, P. F., 1963, “Carnap’s Views on Constructed Systems versus Natural Languages in Analytic Philosophy,” in The Philosophy of Rudolf Carnap, P. A. Schilpp (ed.), LaSalle, IL: Open Court, pp. 503-518.
  • Strawson, P. F., 1992, Analysis and Metaphysics: An Introduction to Philosophy, Oxford: OUP.
  • Suppes, P., 1957, Introduction to Logic, Mineola, New York: Dover.
  • Tarski, A., 1944, “The semantic conception of Truth and the Foundations of Semantics,” Philosophy and Phenomenological Research, 4: 341-376.
  • Tarski, A., 1956, “The Concept of Truth in Formalized Languages,” in Logic, Semantics, Metamathematics. Papers from 1923 to 1938, A. Tarski, Oxford: Clarendon Press, pp. 152-278.
  • Tarski, A., 1969, “Truth and Proof,” Scientific American, 220 (6): 63-77.
  • Tillman, F. A., 1965, “Explication and Ordinary Language Analysis,” Philosophy and Phenomenological Research, 25 (3): 375-383.
  • Tillman, F. A., 1967, “Linguistic Portrayal and Theoretical Involvement,” Philosophy and Phenomenological Research, 27 (4): 597-605.
  • Wilson, M., 2006, Wandering Significance: An Essay on Conceptual Behavior, Oxford: Clarendon Press.

 

Author Information

Moritz Cordes
Email: cordesm@uni-greifswald.de
University of Greifswald
Germany

and

Geo Siegwart
Email: siegwart@uni-greifswald.de
University of Greifswald
Germany

Philosophy of Architecture

The relation between philosophy and architecture is interrogative and propositional. It is about asking questions concerning the meaning of human habitation—what it means to live in built environs—and about evaluating plans and design projects where human flourishing and social progress can best occur—in what kinds of buildings, interior spaces and urban precincts. The following sets of questions address issues—aesthetic, ethical, and political issues, as well as metaphysical and epistemological concerns—that relate philosophy to architecture. Although philosophers and architectural theorists (and often design practitioners) can each be expected to have an interest in any or all of these questions, as scholars or public intellectuals of a kind, architectural theorists have played as much, if not more, of a role in shaping the field than philosophers have. There are historical reasons for this, having much to do with the origins and evolution of different academic disciplines and critical perspectives: the questions likely to be posed by one or the other, for a given period (or perennially in some cases) and the people most concerned to ask them. Here are the questions:

  • What is the philosophy of architecture about? How is, how can, and how should philosophy be connected to architecture?
  • How and in what ways is architecture concerned with aesthetics? How and in what ways is architecture concerned with ethics? Is there a connection?
  • What are architecture’s relations to social and political concerns and what does this tell us about the knowledge and discipline of architecture?

The focus of the article is on aesthetic and ethical issues which are, on virtually all accounts, the mainstay of philosophy of architecture. A consideration of ethical issues in architecture in relation to the aesthetic ones quickly segues into architecture’s relation to social theory and political philosophy.

Table of Contents

  1. Architecture: Discourse and Practice
  2. Key Philosophical Issues
    1. Architecture: What is it?
      1. Form and Function
      2. Affective Form (Function Follows Form)
      3. Architecture as Means to Social Engineering
    2. Architecture and Aesthetics
    3. Architecture and Ethics
  3. Philosophical Movements and Ideas in Architecture
    1. Idealism and Architectural History
    2. Phenomenology and Architectural Experience
    3. Structuralism and Meaning
      1. Postmodernism
  4. Post-structuralism and Power
  5. Selected Lines of Inquiry into Philosophy of Architecture
    1. Architecture and Representation
    2. Architectural Value and Heritage
    3. Environmental Issues
    4. Design Pedagogy
  6. References and Further Reading

1. Architecture: Discourse and Practice

The mixed character of architecture comes from its being a subject of overlapping philosophical and theoretical discourses as well as a category of creative practices. Philosophy of architecture has long been associated first and foremost with aesthetics. While architecture may be an art form, it is not a branch of aesthetics. In fact (or instead), a case can be made for relocating architecture, as philosophically considered, primarily, though by no means exclusively, within ethics and social and political philosophy. From a philosophical perspective, Winters’s (2001) survey article “Architecture” appears to be located where it belongs, in The Routledge Companion to Aesthetics rather than a volume on social and political thought or ethics. But from an architectural theorist’s point of view, or the viewpoint of a geographer or city planner, classifying architecture as a topic within aesthetics may seem too narrow or conforming. Of course, treating architecture within aesthetics does not preclude consideration of it from other points of view and it would be equally legitimate to have entries on architecture in companions to social philosophy or evolutionary psychology. Nevertheless, from these other points of view, placing the philosophy of architecture primarily in aesthetics is misleading. A survey of theory journals like Assemblage, Grey Room, or Architectural Theory Review shows that philosophy remains an important source of ideas and validation, but references to philosophical aesthetics are limited. The philosophical study of architecture raises questions in several philosophical sub-disciplines, and many design practitioners show a greater interest in ethical and political issues than aesthetic ones.

The question of where to place the philosophy of architecture is not simply a matter of preference or a topographical question of little importance. Instead, the issue can be linked to architectural practice in ways that enlarge our conceptions of architecture, architectural practice, and the architect. In any case, the question of where to locate the philosophy of architecture highlights significant differences between architectural theorists and philosophers, as to how to conceptualize, analyze, and address a range of topics of common concern.

While philosophers of art and aesthetics are still more likely to consider architecture than are social philosophers and ethicists, architectural theorists see the connection with ethics and social and political issues as more relevant and important. However, by considering the concerns of both philosophy and architectural theory, philosophy of architecture enlarges its conceptual and critical domain in ways that impact both philosophical and architectural theoretical approaches to architecture.

2. Key Philosophical Issues

Architecture can be, and has been, conceived as an intrinsically philosophical enterprise, grounded in aesthetics and ethics (including theories of human nature) and also in elements of social and political philosophy. Architects, landscape architects, and designers are responsible for creating spaces and fashioning the world (materially and ideationally) in which people live and interact. In so doing they promote as well as undermine certain values, understandings, and ways of living.

One need not cite utopian characterizations of “the City” to make the point that architecture is concerned with material realizations of visions of the good and what it means to live well. Urban culture manifests itself in its architecture. Debates over the future and planning of the City, including schemes that either rehabilitate or disavow utopian traditions, reinforce this important role. Although Ballantyne and Winters both discuss the aesthetic evaluation of architecture at length, they both believe that “we should evaluate buildings according to how well they make possible desired forms of life” (Goldblatt and Paden 2011, 4). Thus, the final standard of architectural value for them is the ethical (Ballantyne 2011; Winters 2011).

Adopting a historical perspective for these concerns, in a 1949 editorial for the Architectural Review announcing a new feature of the review called the “Canon,” the editors (including Nikolaus Pevsner) bemoaned the fact that architectural theorists had long lacked the common source material “necessary for the construction of the framework within which any who are so disposed may begin to discuss the theory and philosophy of architecture.” They stressed the need for a theoretically informed architectural criticism, regarding it as essential to understanding and improving the then current state of architecture and the architectural profession in Britain.

The perceived absence of any such canon or framework for a theory and philosophy of architecture may have undermined or hindered informed architectural criticism in Pevsner’s mind, but it has not stopped speculation on the nature of the relation between philosophy and architecture, including accounts that are quite general and self-aggrandizing. Mueller (1960, 39-43) described the relation as follows:

Both philosophy and architecture are edifying … [They] make possible all other values of life or all other arts … Architecture is their spatial, philosophy their spiritual home. In one and the same act, philosophy and architecture enclose man in their shell and structure, and disclose open vistas, new horizons, spiritual possibilities of expansion and self-realization … architecture [expresses] an underlying world-view, a cultural whole, the spirit of an epoch or a people, a dominant value of life—all transcriptions of philosophy… Philosophy and architecture have the coming task of healing the split of knowledge and feeling, of individual and community…

One source of Sigfried Giedion’s often quoted but obscure claim that the main task of architecture is “the interpretation of a way of life” valid for the times (1974, xxxiii) can be found in Immanuel Kant. Although Kant (“Analytic of the Beautiful” in The Critique of Judgment (1790)) claimed that function limits the potential beauty of buildings, he also claimed that beautiful art has “‘spirit’ by means of which ‘aesthetic ideas’ can be expressed” (Goldblatt and Paden 2011, 3). Presumably, it is this “expressive” dimension of architecture that Giedion has in mind. It is however one thing to say that architecture has an expressive dimension and another to suggest that a building is capable of expressing the spirit of an age.

Modern philosophical discourse on architectural practice can be traced to Kant, John Ruskin (1849), more directly to Martin Heidegger (1951), Pevsner (1936), Giedion (1974), and more recently to Roger Scruton (1979) and Karsten Harries (1997). (See Haldane 1999.) In the 20th century, the discourse has focused on architecture’s seeking to articulate its identity and special relevance by embodying or expressing the social, political, economic and personal character of the times. Other figures, from the philosophical side, include Warwick Fox (2000), Malcolm Budd (1995), Christine Smith (1992), and Tom Spector (2001). Among many widely discussed prominent architectural theorists and practitioners who have contributed to a philosophy of architecture are Vitruvius (First century Roman authority), Le Corbusier, Adolf Loos, Robert Venturi, Bernard Tschumi, Frank Gehry, and Daniel Libeskind.

a. Architecture: What is it?

The key underlying question for all of the preceding philosophers and architectural theorists and practitioners is “What is the philosophy of architecture about?” At the same time, we can also ask “How is, how can, and how should philosophy be connected to architecture?” Both of these raise the further question “What is architecture?”

The question “What is architecture?” has commonly focused first on how architecture is to be distinguished from “mere” building; and second, on its relation to art. Answers have often depended on how one has sought to reconcile or prioritize Vitruvius’s three elements of architecture: firmitas (durability, firmness), utilitas (convenience, commodity, practicality, function), and venustas (beauty, delight). Given the Vitruvian perspective, it is questionable whether these three elements could ever allow one to surmise, let alone deduce, fundamental architectural principles governing every significant work. It is doubtful that such principles could ever be extrapolated from observations of the material, functional or aesthetic attributes of a building: its “durability”, “convenience” or “beauty.” In any case, Vitruvius’s The Ten Books of Architecture (c. 15 B.C.E.) has been the most common source employed by architectural theorists and philosophers concerned with articulating the nature of architecture. (Spector (2001) structures his book around Vitruvius’s three elements.)

Whether, or to what extent, architecture is to be regarded as an art has long been disputed. One could perhaps just conceive of architecture as a craft pertaining to the fashioning of useful buildings. Nevertheless, Vitruvius, like most architectural theorists, sees the aesthetic (venustas) as essential to architecture. However, if architecture is an art, it is unusual and perhaps unique in that, unlike the other arts (music, sculpture, visual arts), utilitas (function) is also regarded as an essential part of it. While music, drama and the other arts may serve many functions, including those that are purely aesthetic, and are “practical” in various ways, none of these functions are regarded as intrinsic to their status as an art or to their ontological status, that is, to what they are.

Functionality (utility, purpose, practicality, and so forth) is, however, necessary to architecture. Graham (1989, 249) says “[A]esthetic functions in music and painting [and so forth] can be abandoned without loss to their essential character as worthwhile objects of aesthetic attention … But the same cannot be said for architecture. A building which fails in the purpose for which it is built [no matter how aesthetically pleasing] is an architectural failure, whatever other merits it may have.” Even those modernists who see form as preeminent and imply that the inverse of the adage holds true (that function must follow form) may agree with Graham on this. An aesthetically designed material arrangement that has no function or extra-aesthetic purpose may be regarded as a sculpture, and possibly even a “building.” However, on the accounts stemming in theory from Vitruvius, unless it has some (not wholly aesthetic) function (utilitas), no matter how changeable (the functions of buildings may of course change) or nondescript, it cannot be regarded as architecture.

Given that the elements in the Vitruvian triad are all ingredient in architecture to some degree, the central issue concerning the nature of architecture often rests on determining which of these indispensable elements does or should take precedence and why. Given that the aesthetic (or aesthetic concern) is always necessary, though to what extent, and whether dictated primarily by form or function is disputed, the history of architectural theory can be seen as a debate between those who place emphasis on form rather than function or vice-versa. Graham (1989, 252-3) says “The philosophical dispute between the two lines of thought suggested here—function should determine form and form should determine function—is arguably the basis of the history of architecture in the last 120 years. It is around these themes that the differences between architects of the late 19th century and the Modernists are best understood.”

Whereas the arts typically are regarded as non-functionally valuable for their own sakes, Winters is saying that not only is architecture functional, but also that its aesthetic values are integral rather than incidental to its functionality. Lagueux (2004) claims that the link between ethics and aesthetics is intrinsic. Ethical problems are at one and the same time aesthetic problems for the architect. This is one, if not “the,” defining feature of architecture.

i. Form and Function

Theory and practice may diverge in addressing this question of precedence. In practice, it is clear that some architecture is concerned primarily with function and the performance of buildings however this is conceived (occupationally, economically or in terms of durability, and so forth), and some largely with form. The symbolic and mnemonic demands placed upon commemorative architecture, for instance, typically result in designs with a strong emphasis on form. Of course, the expectation that a monument or memorial convey memories and ideas about the past also entails a kind of functionalist—that is, performative—reasoning. The two elements are not so easily distinguished.

Theoretically speaking, however, the need for a far closer connection between function and form has frequently been envisioned. Some claim that buildings in which form and function are closely related make for better architecture. Those who argue that they do (Pevsner 1937, 11) see a mismatch between form and function (for example, if a commercial establishment were designed to look like a church) as a deception, a fraud, or perhaps morally uncertain. Graham (1989, 252) claims that while “copying of styles and the extensive use of facades” may in a sense be deceptive, it is not immoral, and yet “other things being equal, [ideally] such deception is better avoided, if it can be.” But why its avoidance would be preferable is unclear, and views on the subject are mixed and have changed over time. According to the 19th century horticulturalist and architectural critic John C. Loudon (1822, 1013), “A barn disguised as a church would afford satisfaction to none but those who considered it as a trick. The beauty of truth is so essential to every other kind of beauty that it can neither be dispensed with in art nor in morals.”

Contrast this sentiment with Venturi’s celebration of the “decorated shed” 150 years later in Learning from Las Vegas (Venturi, Brown & Izenour 1972), or the notoriety granted Gehry with his “Binoculars Building” in Los Angeles and his collaboration with artists Claes Oldenburg and Coosje van Bruggen on the building’s entrance façade. Graham’s (and Loudon’s) view is likely rooted in a normative presupposition about the relation between form and function: that it is better to have them “unified” in some sense. Graham (1989, 252) says “a building which declares its functions openly and yet at the same time succeeds in conveying all those attributes which the use of a façade aimed to do, would be preferable.” But why and in what way it would be preferable remains unclear. Watkin (1984) thinks that this talk of deception is misguided and that in any case, architecturally speaking, there is nothing wrong with such practices.

Louis Sullivan, the architect responsible for developing the architecture of the late 19th century skyscraper in Chicago, is known for the principle—“form follows function”—around which possibly most debate in modern architecture and design has focused. Augustus Pugin, known for his work on the Gothic revival, made much the same point. He argued against architectural features unrelated to the purpose of a building. For Sullivan, the principle was metaphysically grounded—a kind of law of nature that was normative.

“Form follows function” was seen by some as an inviolable principle offering unique design solutions. It is closely associated with modernist architects early in the 20th century and later. Adolf Loos famously denounced building ornament as a crime for it was superfluous to function and consequently immoral. Frank Lloyd Wright, Sullivan’s assistant at one time, also adopted the principle. The debate on just how the principle is to be interpreted and applied, as well as its validity on any interpretation, is ongoing.

Granted that the proposed use of a building will naturally influence its form (its design), the idea that “use” determines the form seems far too prescriptive. After all, many quite disparate forms might equally well serve a building’s function. The idea that any particular architectural design—no matter how fitting—is more or less uniquely dictated by the precise function can be little more than a retrospective and wishful justification for one’s own design choices. The dictum “form should follow function” is more likely to hold, or to hold to a far greater extent, in engineering or industrial design (for example, a fuel injector or a heart valve) than in architecture. Questions remain. Just how is one to justify the dictum “form should follow function”? Is it a metaphysical or a normative ethical principle and/or principle of aesthetics? Is it some other kind of irreducible architectural principle? Is the dictum’s justification meant to be logical, rational and/or affective, or in some way grounded in experience or phenomenology?

ii. Affective Form (Function Follows Form)

Le Corbusier’s modernist architecture seeks to create, influence, redefine or even determine the functions that architectural shapes and spaces (not simply “buildings”) are used for. For Le Corbusier, architecture is art. However, the idea that form can influence or even determine function and thereby shape human behavior and communities makes the architect more than, and other than, an artist. It makes the architect a social engineer and planner of sorts as well as likely a moralist and visionary (albeit not necessarily a good visionary). As Le Corbusier conceived it, the architect is, to various degrees, able to control the uses that designed space is put to—how occupants move in such spaces, how they live in them—and perhaps even the kinds of thoughts and inclinations they have as a result of experiencing designed space in certain ways. Needless to say, questions remain regarding the extent to which Le Corbusier was successful. Not everyone liked the results.

The idea that building form affects its occupants, physically and/or mentally, is not new. Different metaphysical systems postulated how this works long before the terms “form” and “function” became part of the modernist vocabulary. For Renaissance humanist architects, for example, relations of resemblance or similitude embedded in neo-Platonic doctrine suggested that church domes be designed to emulate the vault of heaven, while belief in a force of sympathetic attraction accounted for why the eye is drawn upwards upon entering the sanctuary. By comparison, in the early 19th century Loudon and his contemporaries turned to then current theories of “associationism” to explain why buildings should look certain ways. (A barn should look like a barn, a church like an ecclesiastical building—particularly a Gothic one according to the aesthetic predilections of the day.) Arguably, the “form follows function” equation is the more empirical and deterministic, largely behavioral and socially normalizing, successor to associationism.

Le Corbusier was principally concerned with domestic habitation (housing). His architecture was not intended to service preconceived ideas about what such habitation should be, but to create new and as yet undetermined possibilities for living. The modernist realizations of these undetermined possibilities for habitation were often considered failures. Judged “ugly” or not, iconic modernist building types were found to be unpleasant to live or to work in. High-rise blocks of flats and urban housing estates like Pruitt-Igoe in St. Louis (famously dynamited in 1972, only 16 years after its completion) were routinely derided as “eyesores” for their monumental scale and visual monotony and condemned for their crowdedness, exposure to uninvited surveillance and characteristic disrepair. Architectural historian and critic Charles Jencks (1977) famously described the demolition as the day modern architecture officially died.

However, refusing to unthinkingly yield to preconceived and possibly worn-out notions of habitation can hardly be said to lead to the conclusion that a building’s function follows (or should follow) its form in some highly determinate manner. Likewise, a suburban estate clothed in neo-traditional garb, with encircling gardens, front porches and pathways for pedestrian interaction, is no more guaranteed to produce community than a block of council flats, however well-designed either may be. One can read Le Corbusier’s dictum that “the house is a machine for living in” as a useful prompt for thinking about possibilities for and limitations on architecture’s capacity to provide for human wellbeing and social vitality.

The affective capacity of architecture is the driving idea behind modernism—one that still exercises considerable influence. This can be seen particularly in the formally inventive work by Gehry, Zaha Hadid and others of contemporary architecture’s best known designers. However, the idea has had a far more lasting influence in other design disciplines such as city planning. Thus, it is claimed that certain kinds of public spaces enhance democratic (or totalitarian or socialist, and so forth) values while others are constructed to promote alternative sets of values and ways of thinking and behaving. The broad claim is that the function as well as ethos (moods, motivations, ethics) of buildings, parks, neighborhoods or entire cities follow from their design (form).

It is easy to see how design, choice of material and planning (form), may enhance or curtail certain values and functions. But just as in the case of those who claim “form follows function,” only an ideologue would claim that function is determined by, or should be determined by, form and form alone. Innovation of form may influence, expand and otherwise alter our understanding of function in some cases, but this still falls short of strict adherence to the direct causal connection implied by Le Corbusier’s mechanical analogy. A parking lot must have a place to park cars and a laundromat a place where clothes can be cleaned. Whether one is building a home, factory or an airport, the building’s intended function needs to be taken into account. But if such function does not follow from its form alone, then any dogmatic (absolute) interpretation of Le Corbusier’s version of modernism will fail.

Furthermore, there are various kinds of built environments, communities, dwellings, public spaces, and kinds of cities that people enjoy or dislike (often at the same time). None may be intrinsically better than any other when judged by a single criterion, whether that criterion is “form follows function” or vice versa. Some may suit the “well-being” of particular individuals better than others. “Which is better, Paris or London…the city or the country?” are questions that have no determinate answer unless one regards them, as one should, as questions about preferences. Certain goals of development and the realization of one kind of community or public space will often preclude others, although they need not.

Graham (1989, 255) says “a style of architecture which satisfies both functional and aesthetic considerations and has a greater unity is intelligible as an ideal, and one to which many generations of architects have aspired.” It is not difficult to see why architecture that satisfies both kinds of considerations in some unified way, and there may be various equally fine ways of doing so, is desirable. To see this as an ideal is to see aesthetics as intrinsic to architecture (and design). But this ideal is incompatible with the ideology on either side of the form versus function debate. Architecture that achieves such unity will have succeeded in embedding its function in its form and expressing it by means of its form. The ability to achieve this in various ways and to various degrees is an essential part of the architect’s (designer’s) capability qua architect.

However, just how and to what extent buildings can convey meaning or ideas regarding function, as opposed to having meaning and ideas attributed to them, is controversial (Whyte 2006; Graham 1989, 256). For instance, Parsons and Carlson (2008) claim that aesthetic judgments of buildings depend directly upon satisfying functional requirements. Nonetheless, even when buildings appear perfectly suited to their purposes (for example, churches, sport stadiums, schools, prisons) they do not entail meaning and value independent of context and associations accrued through time and place. Meaning and value are virtually always contextualized.

iii. Architecture as Means to Social Engineering

Social engineering (the “possibility of making society”) and physical determinism (influencing or determining human behavior through space) are ideas that preceded Le Corbusier. They have been deeply embedded in modern design and urban planning from the start (see Lawhon 2009) and they continue to be influential. An important supposition underlying such ideas is captured by David Brain (2005, 233) who says, “In the context of the urban landscape, every design and planning decision is a value proposition, and a proposition that has to do with social and political relationships.” More recently these issues have re-appeared in the debate on “New Urbanism” as well as with self-reflective questions concerning the nature and aspirations of contemporary architecture and planning.

New Urbanism is a movement codified in the Congress for the New Urbanism’s (CNU) charter (Leccese and McCormick 2000) and identified by a set of 27 principles and evaluative ideas about how cities, particularly suburban cities, should be organized. The CNU sees architecture as the means to social engineering, making for genuine community. The appeal to “community” is ubiquitous in contemporary architectural discourse, partly because notions of “community” are often invoked as a justification by practitioners on behalf of favored design practices. New Urbanism is hard to pin down because just which projects meet or fail to satisfy the CNU’s principles is disputed. Many of the movement’s aims are clearly aligned with late 20th century showcase communities like “Seaside” and the Disney Corporation’s “Celebration” in Florida. The term has been applied retrospectively to the post-World War II planned suburban community “Levittown” and additional developments in Pennsylvania and New York built in the late 1940s and 1950s. New Urbanism aims to provide an alternative, a remedy to suburban sprawl and urban decay and bring about much needed social and political change through design and planning. In this regard it shares features with the anti-sprawl “Smart Growth” planning movement.

The CNU charter proposes that cities and towns should “bring into proximity a broad spectrum of public and private uses to support a regional economy that benefits people of all incomes” (principle 7). Proximity requires that “many of the activities of daily living should occur within walking distance, allowing independence to those who do not drive, especially the elderly and the young” (12). The movement’s followers promise to design civic buildings and public gathering places that “reinforce community identity and the culture of democracy” (25). Such principles demonstrate New Urbanism’s self-conscious concern to bring urban planning into line with certain ethical (including social and political) standards and values. These are the values that its charter delineates as consonant with what democracy, social justice, and more generally “human flourishing” require in a contemporary urban environment. This objective calls to mind Giedion’s and Harries’s view of the architect as, in equal parts, social visionary, political provocateur, and savior. It sees the principal task of the architect or planner as one of interpreting and helping to build, in Giedion’s terms, “a way of life valid for our time.” More pointedly, New Urbanism illustrates Lagueux’s (2004) contention that architecture and ethics are indissolubly joined.

This raises the question as to whether these aims can be realized unless democratic institutions and a foundation of social and economic justice are, to a degree, already operative in the spheres where decisions about development take place. Without such a foundation, development remains in the hands of the kind of “conventional development regime” that Brain (2006, 18-19) cites as (partly) responsible for urban sprawl and inner-city decay in the first place. This regime is constituted by “an interlocking system of financing formulas, measures of market feasibility, product types, zoning categories, environmental impact assessments, and routinized planning practices” that make it virtually impossible to undertake projects “that don’t fit standardized categories.”

Those theorizing the nature of urban development often insist that planning be used in ways that enhance and shape the democratic character of the city. But this is not an easy thing to define, as if democratic values were simply decided by consensus—doubtful, given that the “tyranny of the majority” is always a possibility. Moreover, how can the manipulation of the physical environment through architecture contribute to the inculcation of democratic character and values? This has been the most significant question for urban planning since its inception. It is a question only partly about technique.

Physical determinism addresses an important aspect of social engineering in relation to planning in asking how, and to what extent, certain values and ways of life (human behavior) can be inculcated through the physical (planned) environment. In city planning and urban design, physical determinism is underscored by the belief that human behavior is determined by environment. It “implies that the design influences residents’ behavior according to some pattern desired by the designer” (Lawhon 2009, 14).

Sociologist Herbert Gans describes the “fallacy of physical determinism,” which addresses a central aspect of “rational goals-means determination” in questioning the extent to which goals made explicit in the design process can be realized by the physical instantiation of a design. He explains (1968, vii) that “Planning is a method of public decision-making which emphasizes explicit goal-choice and rational goals-means determination, so that decisions can be based on the goals people are seeking and on the most effective programs to achieve them.” However, problems arise when there is a lack of explicit goal-choice, or when some superficially well-defined goal-choices turn out to be nebulous. Insofar as goal-choices involve evaluative and interpretive concepts (for example, “democratic values” or “security”)—concepts whose meanings vary widely relative to ethnic, economic, political, religious and other social groups—what may seem like clear choices may gloss over deep divisions.

The fallacy of physical determinism is meant to question the link between physical design concepts and social outcomes. Gans argues (1968), for example, “that the social homogeneity [race and income] of residential areas based on the neighborhood unit was the chief reason for the success of these neighborhoods and that physical determinism was not a chief determining factor in how successful neighborhoods actually were in forming cohesive, stable units” (Lawhon 2009, 13). But Gans’ criticism, narrowly interpreted, is a broadside that misses the significant claim (that geo-spatial environment does affect behavior) by attacking its haplessly over-generalized cousin (geo-spatial environment determines behavior in a quasi-metaphysical sense of determinism). A severe economic recession can bring down even the most successful and socially homogeneous residential area, and no amount of urban planning is going to bring about happily integrated communities of different socio-economic and racial backgrounds, given the kinds of race and class divisions that existed in many cities in the near past and that still largely exist in most cities.

Given that very few design professionals hold the kind of strong determinism that the fallacy of physical determinism applies to, it is not this aspect of planning that needs to be queried. The challenges are instead twofold. First, it is the notion of “desired social outcomes” that requires and has received attention. But to reiterate, this part of the problem, that of articulating an adequate and justifiable goal-choice, is an issue that is not wholly, not even primarily, architectural. It is ethical. Jane Jacobs, in her classic book and criticism of 1950s-style rationalist planning, The Death and Life of Great American Cities (1961), understood this, and it was this that enabled her to redefine the relation between physical design concepts and desired social outcomes.

The second challenge is fundamentally design based. Given that behavior and thought are affected by environment, the question for planning professionals is how to construct an environment in ways that help affect (not determine) behavior: that help to inculcate desirable values (for example, democratic and other social values), and that are also responsive to the values, at least some of the values, of the inhabitants. The question, at the center of modern architectural planning, remains “what is the nature and character of a range of likely effects (plural) and how do these engage with ethics?” The notion of community has had a contested part to play in New Urbanism, but so too have other central social and political ideas that relocate the principal focus of architecture in relation to philosophy from aesthetics to social and political philosophy and ethics. The Aristotelian notion of what it is to “live well” (human flourishing) has become closely connected to questions about how the built environment can either enhance or detract from a virtuous and otherwise “good” life (compare Ballantyne 2011 and Winters 2011). From this perspective, the goal and “art” of architects and other design professionals is to enhance the “good” life by adhering to established design principles, while also inventively suggesting ever “better” ways of living.

This heightened connection between philosophy and architecture—practical as well as theoretical—involves both an enlargement and reconfiguration of what we take architects, design professionals, and even engineers, to be and to do. Architects in particular, most noticeably the icons of 20th century architecture, have embraced and promoted themselves not only as arbiters and promulgators of taste (an aesthetic function), but also of value: as visionaries capable of addressing fundamental social and political issues, even spiritual ones (for example, national identity and aspirations) through innovative design in ways that others are simply unable to do—an ethical function bordering at times on the salvific.

Seen from the perspective of the design professional, living up to such a new understanding of their role may seem daunting. Design professionals are first and foremost just that. They are not, on the face of it, ethicists, nor does it seem that they need to be politically active or concerned in their own right, in order to conduct their professional lives. The issue then is whether it is only their expertise as “technicians of space” that is required, or whether “architecture” now also implies that practitioners engage with places as designers and citizens—both with a broad understanding of ethics, social philosophy, and so forth.

b. Architecture and Aesthetics

The principal, though not sole, question concerning architecture in relation to aesthetics is whether architecture, or at least some architecture, is art. Granted that at least some architecture is art, then issues relating to the connection between architecture as an art form and ethics can be and have been raised. Likewise, architecture’s concern with ethics is highlighted when asking “Is architecture an art form?”

The question seems to be of more concern to those interested in philosophical aesthetics than to either architects or architectural theorists. Nevertheless, it is central to the philosophy of architecture. Just how the aesthetician or architectural theorist responds to the question is determined by their particular accounts of what a work of art is, or their ontology of art—if they have one. If an artwork is characterized as necessarily non-functional, then there would be reason to exclude virtually all works of architecture as objects of art.

Still, one can deny that architectural objects are objects of art while maintaining that there is or should be an aesthetic dimension to architectural objects. Works of architecture can be judged on aesthetic grounds in accordance with aesthetic standards of one kind or another—though arguably they either cannot be or should not be judged on aesthetic grounds alone—without thereby being regarded as art objects proper. It is pointless, however, to deny that some buildings are “beautiful” or that they may engender an aesthetic experience, leaving aside how such an experience is to be understood.

Some architects may regard architecture as an art form. But for those that do, the reason has less to do with a preconceived idea of the nature or ontology of art, than with understanding such an assignation as honorific in some sense. If architectural objects can be art objects, then architects must be artists, along with demonstrating whatever else—technical skill for example—that may be involved in being an architect.

From the perspective of architecture as an applied practice rather than the philosophy of architecture and aesthetics as scholarly disciplines, the question of whether architecture is an art form, and buildings objects of art, could be seen as resting on an ambivalence between art or being an artist on the one hand, and being “artful” and showing due concern for enhancing the aesthetic aspect of the built environment on the other. The O.E.D. defines “artful” as “Displaying or characterized by technical skill,” or “That [which] has practical, operative, or constructive skill; dexterous, clever.” Thus, not only can surgeons and architects be artful, but so too can cooks, car mechanics and thieves.

At times, when a level of “artfulness” displayed is of a very high or remarkable standard, it might be and sometimes is said, that the product is a work of art. Julia Child was an artist in the kitchen, much in the same way that skillful and inventive surgeons might be “artists” in the operating room, teachers in the classroom, and certainly hairdressers in the salon, and so forth. And, although there is undoubtedly an aesthetic dimension in cooking (as well as an architectural dimension if a kind of structure (form) or performative value (a function) is manifest by a certain dish), the aesthetic appears to play a particularly essential role, at least as a desideratum, in architecture.

In his essay “Is Architecture Art?” Davies (1994, 37) never questions whether buildings can be artworks. He says, “it seems obvious that many works are uncontroversial both in being buildings and works of art.” The issue, however, is not whether they are so acclaimed but whether they should be. Those who proclaim such buildings as artworks do not necessarily rely on some articulated and defended notion of art. Their acclaim appears to be largely honorific; another way of saying that such buildings are beautiful and remarkable. It doesn’t necessarily follow that everything that is beautiful is a work of art. Davies’ concern is rather with what kind of artworks they might be, with their resemblance to some kinds of artworks but not others, with the role of the designer’s intentions in determining artfulness, and with technical virtuosity, site, and culturally specific contexts underpinning their status as art objects. He asks: are buildings that are artworks “singular as are hewn sculptures, or instead admit of multiple instances (as do cast bronzes, novels, symphonies, and the like)” (1994, 43)?

The claims Davies sees as uncontroversial—that architecture (buildings) may be art; that some specific buildings are artworks, and that some architects (and then only sometimes) are artists—others maintain are confused or mistaken. The view is that such claims carelessly, albeit at times on theoretical grounds, conflate the aesthetic dimension of architecture with art. The last claim is mistaken in particular for seeing architects not as “artful” practitioners, but as artists. Architects may artfully design buildings and houses that enhance the lifestyles and values of their occupants or even suggest new and alternative ones. They may design spaces that promote democratic values, sociability and neighborliness, and workplaces that are particularly well-suited to the specific needs of workers. But in so doing they are practicing architecture—applying their skills—rather than functioning as artists. Even where aesthetic concerns are predominant, it may just be a way of talking to call their products works of art.

c. Architecture and Ethics

Architecture’s concern with ethics is perhaps more clearly highlighted when asking about its relations to social and political concerns. What does this tell us about the knowledge and discipline of architecture?

Works of architecture—not just great or iconic works, but those where design is manifest in practical concerns—are also aesthetic achievements. A well-built house, for example, is not a bunker but potentially a home—where the notion of it being a home has an aesthetic and moral valence that ideally contributes to the well-being of its inhabitants. Architecture is often judged in terms of aesthetic and technical, rather than moral, criteria. Yet, the view that judgments based on aesthetic criteria are independent of those based on moral criteria has a history of being challenged. The idea that the aesthetic value of an art work, including architecture, is independent of moral considerations, and so should properly be judged apart from such considerations—the view in philosophical aesthetics known as “aestheticism” or “autonomism”—is perhaps more easily disputed in architecture than in any other aesthetic endeavor. Even on the face of it, architecture impacts our daily lives in ways that are morally significant. Architecture’s concern with aesthetics is mediated in ways that, according to some, make it an essentially ethical discipline.

As we have seen, various understandings of the relation between form and function already contain ethically normative precepts. Adolf Loos’s functionalism, as implied in his claim about ornament and crime, is ethically as well as architecturally grounded. While some architectural theory remains focused on Vitruvius’s elements and the relation between form and function, contemporary discussion about the relationship between architecture (including landscape architecture, and other planning and design professions) and ethics (including social and political philosophy), has refocused the discussion in different terms.

Thus, Lagueux (2004) argues for an intrinsic connection between architecture and ethics, distinguishing this connection from art forms and professions in which, he argues, any connection with ethics is extrinsic. He claims that architectural problems are, at one and the same time, ethical problems and that the two, being intrinsically related though not identical, must be solved at the same time and in the same way. This alleged connection between architecture and ethics may be seen to be a reformulation or evolution of the Vitruvian problem, where the notion of function or utility (or essential function) is interpreted as irreducibly ethical in part, and the “ethical” is understood to include judgments about value—about what is “good” as well as about what is right.

Even if it is true that interventions in the urban landscape have ethical implications as Brain believes, this would not necessarily substantiate Lagueux’s claim that architecture should recognize its inherently ethico-political character. The two sets of problems might best be kept separate and, to a degree, resolved separately. Nevertheless, in practice there may be reason to believe he is right, even if no sharp distinction can always be made between architecture and other disciplines (for example, medicine and biology) as to whether ethical considerations are intrinsic.

Lagueux’s claim regarding architecture and ethics as opposed to other disciplines may seem implausible. Medicine, for example, inevitably confronts its practitioners with practical moral problems and dilemmas that must be considered in relation to the concrete details of the situation. Lagueux does not deny this, but claims that such moral problems remain moral problems and that there is no fundamental or intrinsic connection between the medical and ethical aspects of the problem. One might, for instance, bring in an ethical specialist for advice—as indeed is often the case. Lagueux needs to explain why he sees architecture as having resources for dealing with the moral issues it raises that medicine lacks. If doctors not trained in ethics cannot deal with the issues, what qualifies similarly untrained architects to do so?

Lagueux would say that insofar as architects are not ethically competent, they are also not architecturally competent. In other words, unlike the case of medicine, ethical and aesthetic problems are linked in such a way that ideally they must be resolved at one and the same time—even in the absence of any unique solution. Insofar as architecture (or the architect) does not have resources for dealing with the moral issues, it fails as architecture. Lagueux’s claim is that, unlike the case of medicine, the architect qua architect requires ethical training because they cannot practice architecture without it. Nor does he claim that architecture is unique in this respect as against all the other arts (for example, cinematography), in that it alone has ethical and aesthetic issues intrinsically linked. In any case, even if one denies that Lagueux’s claim is universally true, one might accept it as characteristic of architecture.

Since Lagueux sees aesthetics and ethics as intrinsically connected in architecture in a way they are not in other disciplines, architecture for Lagueux is characterized by the way it presents the practitioner with ethical problems linked to aesthetic ones. For example, the placement of windows and doors in a building should satisfy both aesthetic considerations, like pleasing views, and ethical ones, such as due concern for neighbors’ privacy. A more complex example would be designing a public atrium as part of a corporate complex where due consideration is given on the one hand to its utility as publicly accessible space—responding to the needs, desires and values of those who inhabit and traverse the space—and on the other to the perhaps conflicting concerns, or incommensurate values, of those inhabiting neighboring work environments. A more abstract case yet might be the construction of a public space—a park or a square—designed to be aesthetically pleasing but also, by means of its design, to promote certain civic and democratic values.

3. Philosophical Movements and Ideas in Architecture

As a field that engages multiple disciplines, philosophy of architecture can be aligned with currents of inquiry seen across the humanities. Before Pevsner penned his 1949 editorial in Architectural Review and to a greater extent since the 1960s, architectural theory has provided a philosophical gloss for architectural criticism, design practices and education. Much of what counts as scholarship on architecture has come to resemble a history of philosophical ideas. The changeable terrain and contingencies of practice have resulted in a continuing critical reappraisal of the discipline’s terms, and intellectual and aesthetic traditions (including the Vitruvian triad and its legacy). Theory has been informed to a large extent by continental European philosophy. Movements such as German idealism, phenomenology, structuralism and post-structuralism, the Frankfurt School, neo-Marxism, psychoanalytic theory, and feminist and deconstruction (literary) theory have found an audience among architectural historians, theorists and practitioners at the “cutting edge” of design.

Arguably, the autonomy of architecture, like other creative arts (for example, film), is made questionable by what some have described as the indeterminate, mixed or “hybrid” character of the discipline and by the critical writing architecture attracts. This includes theory that treats creative disciplines as primarily demonstrative of philosophical truths, rather than productive of ethical insights into the human condition. In his effort to provide a more comprehensive account of the field, Andrew Benjamin, who has written extensively on architecture and the continental tradition, proposes to “think the particularity of the architectural” and devise a uniquely “architectural philosophy” (2000, vii).

Whether Benjamin’s undertaking, or any other, has provided the kind of framework for the philosophy of architecture that Pevsner desired, is open to question. Much depends on how “philosophy” is itself understood and where one stands in relation to history, theory, or practice. Adopting an “honorific” conception of philosophy, for instance, privileges its modes of interrogation as the means to clarify and adjudicate claims of truth arising in these areas. Another view, common in schools of architecture and shared by practitioners seeking intellectual rigor for their work, requires that a “philosophy”—guiding principles or a theoretical exegesis—accompany each design project.

a. Idealism and Architectural History

Idealism, specifically the movement with origins in late 18th and early 19th century German philosophy and bearing the imprimatur of Kant and especially Hegel, is significant for treating works of architecture as objects of our consciousness, their meaning and value being variable, though ultimately determined by the mind’s responsiveness to the material world. As McQuillan points out in his article on German Idealism, the movement is remarkable for its systematic treatment of several philosophical disciplines, including aesthetics, to which one can add art history which followed as a recognizable discipline later.

Architectural history is largely an offshoot of art history. German idealist historians writing in the mid- to late 19th and early 20th centuries (Schnaase, Semper, Wölfflin and Warburg and others; see Podro 1982) contributed much to the formation of art and architectural canons. Critical historiography on architecture developed alongside Hegelian notions of Zeitgeist (the spirit of the age manifest in art forms) and Weltanschauung (the notion that art represents a people’s worldview). Philosophical debate on the nature of architecture was given impetus by comparative analyses fostered by this tradition and the view that saw art forms categorised according to their purported capacities to manifest universal truths.

The influence of Hegel and idealism can be seen in Pevsner’s writing on the origins of the modern movement, notably in Pioneers of the Modern Movement (1936). In this seminal text, developments in architectural form manifest an emerging functionalist aesthetic and spirit indicative of the modern age. There is a strong sense of historical determinism behind this movement. Hence, in Pioneers there is dramatic language of “stages being set,” of heroic architects “appearing on the scene” and of designs “ahead of their time” (122, 132, 136). Historical determinism poses a particular challenge to expectations for an architect’s autonomous control of a work and the capacity of a cohort of avant-garde architects to initiate a new direction for contemporary design. Idealism’s legacy is perhaps best seen in its contribution to subsequent philosophical movements (like Husserl’s phenomenological idealism) and in the broad expectation that art and architecture contribute to understanding the historical moment.

b. Phenomenology and Architectural Experience

The particular nature and significance of architecture has often been discussed in terms of ways that buildings (or some of them anyway) can be experienced. Among philosophers and architectural theorists and designers, there is the broad expectation that different types of buildings, and public and private spaces, engage human perceptions and feelings in ways that both shape and are shaped by patterns of human behaviour and self-consciousness. There is a corresponding and overlapping set of interests, expressed within and outside the academy, questioning how cities allow for distinctive forms of “urban experience” or how certain kinds of public or monumental architecture or “heritage” precincts make for an experience of history that is distinctive, stimulating, and productive of good citizenship. Social, political and ethical contexts for architecture and urban design are raised by such studies as well as others.

From the perspective of moral philosophy, a subset of aesthetic concerns focuses specifically on what an “aesthetic experience” of buildings might be as a means of grounding claims of value. For instance, Michael Mitias (1999) proposes that an “adequate analysis” of the experience of architecture is possible and this is the “safest road” to a reasoned understanding of what architecture is about, for evaluating it and for finding principles of education in architectural aesthetics. He questions:

Under what theoretical and perceptual conditions it is possible to experience, appreciate and evaluate a building as an architectural integrity, in its own terms, without appealing to, or relying on, an external or implied philosophical, ideological, political, or social agenda? (61)

Thoughts on what an experience of architecture may be acquire greater conceptual rigour in the context of phenomenology. This has been understood as either a disciplinary field in philosophy alongside other studies like ontology and epistemology, logic and ethics, or as a more specific movement in the history of philosophical ideas informed by, among others, Edmund Husserl and Martin Heidegger, Maurice Merleau-Ponty and Gaston Bachelard. Phenomenology studies the “appearances of things, or things as they appear in our experience or the ways we experience things, [and] thus the meaning things have in our experience” (Smith 2003). When studying buildings, particularly for their existential and transcendental value, the phenomenologist emphasises the subject and the subjective or first-person view of architecture as a condition of conscious awareness. In the work of Christian Norberg-Schulz (1980 [1979]), phenomenology is concerned with the concept of the “genius-loci” whereby the distinctive character or spirit of a place is reinforced by patterns of human settlement and acts of building and dwelling. Urban form, architecture and contrived landscapes that aim at “place-making” elicit a similar concept.

Phenomenology is an influential movement in architectural theory, though its interpretation and application are far from univocal. Its proponents vary in their commitment to its key terms and thinkers, and take its applications and implications (tendencies like transcendentalism or existentialism) in different directions. For Alberto Perez-Gomez (1983), for instance, transcendental phenomenology informs a particular perspective on modernism, supporting the contrast of creative “poiesis” and meaning, on the one hand, and architecture’s representation as plans and drawings—along with its rationalised construction and role as consumerist object—on the other. Theorists who contribute to this and parallel lines of thinking include Juhani Pallasmaa, Dalibor Vesely and Karsten Harries. Architectural practitioners like Steven Holl and Peter Zumthor cite the influence of phenomenology upon their designs.

Phenomenology is influential on architecture, though it provides no clear and categorical definition of architecture. This is partly because there are social grounds for experiencing buildings and semantic considerations that characterise architectural aesthetics according to cultural differences, including discriminations between “high” and “low” art. Whether the function of buildings makes for a different kind of experience from the pleasure derived from their beauty or perception of any likely “architectural integrity” they may have is also at issue. So too is the possibility there are conditions that make for an experience of “bad” architecture. Consider whether places like detention centres can be improved by designing with the genius-loci in mind.

c. Structuralism and Meaning

Widely attributed to the pioneering work of Ferdinand de Saussure, structuralism was a movement introduced into a number of academic disciplines in the 1950s and 60s. It was an outgrowth of interests in linguistics, semiotics, and allied studies of language. It was influential in anthropology, with work by Claude Lévi-Strauss. Structuralism’s subsequent appeal for architectural theorists was largely due to its promise of a more philosophical, systematic or “scientific” framework for what had long been presupposed (some believe since the Renaissance; others since Vitruvius) that architecture was akin to language and that, like written text, architectural form exhibited a grammar-like structure for conveying meaning. According to this reasoning, material details (classical orders, ornament, and so forth) of buildings or series of building facades, are conceived as metonymic wholes, possessing semantic content and conceivably ethical worth (valence) for communicating meanings and values within social formations and from one generation to the next. Victor Hugo more or less espoused the idea in Notre-Dame de Paris where he bemoaned the arrival of the printing press and cheaply reproduced books. He counterpoised the fluidity and unreliability of the written word with the heyday of architecture in the form of the gothic cathedral on which he believed meanings were artistically manifest in stone—thus acquiring greater permanency and social relevance.

Linguistic structuralism promised not so much a philosophy of architecture; rather, it required that study of architectural aesthetics conform to the model and ideal of language and adhere to what amounts to an empiricist conception of knowledge. Structuralism’s methods worked to establish a fundamental opposition between (i) architectural form and function—privileging the communicative capacity of architectural aesthetics over a building’s other performative roles (as structure, shelter, or its function as a commodity, and so forth)—and (ii) between architectural form as a category of signifiers, and a largely pre-existing context of potentially meaningful artifacts, signified entities or referents.

Accordingly, Umberto Eco (1968) effectively recast the Vitruvian terms of form and function as elements in a culturally-grounded system of architectural signification, thereby denying the precedence and determining influence the modernists gave to one term over the other:

In other words, the principle that form follows function might be restated: the form of the object must, besides making the function possible denote that function clearly enough to make it practicable as well as desirable [emphasis in original], clearly enough to dispose one to the actions through which it would be fulfilled. (186)

Eco moves to distinguish between primary (denotative) and secondary (connotative) functions, neither more important than the other, but each dependent upon the other to form a “semiotic mechanism” (188). Hence, the form of either a barn or a church allows them to function as habitable spaces of a kind (their primary function) and these forms denote this purpose. Their doors “tell” us there is space inside; their windows “tell” us there is light with which to see and so forth. The combination and arrangement of building details work alongside cultural codes to connote (their secondary function) that the first building type, the barn, is just that, merely a building, while the second possesses architectural significance. Roland Barthes complicates the idea that architectural signs are composed by the one-to-one correspondence between signifiers and signifieds. In “Semiology and the Urban” (1971) he emphasizes the transience of urban life so that meanings are not fixed by such a correlation, but temporary and mobile.

Among architectural theorists and practitioners, renewed emphasis in the 1970s and early 80s on the meaningful interpretation of architectural and urban typologies (the classification and comparison of the formal and visual characteristics of building types and urban forms) reinforced the linguistic model. Reyner Banham (in Baird and Jencks, 1969, 101) rejected the move, believing that arguments in support of architectural semantics were merely promoting a new ideology of monumentality in the service of social elites rather than a more rational formalism and egalitarian (that is, functionalist) approach to design. Contributions to the debate over meaning versus functionalism in architecture were published in the first book in English on the subject, Meaning in Architecture (Baird and Jencks, 1969). Additional titles promoting the language of architecture appeared in quick succession, including Venturi, Brown & Izenour (1972) and Jencks (1977). Arguably, Banham’s functionalism and egalitarianism were pushed aside in preference for the stylistic eclecticism and populism allowed for in these books.

Borrowing from Noam Chomsky’s linguistics, Peter Eisenman began a series of experimental projects in the 1970s. These were primarily small houses designed with highly complex forms and models resembling abstract geometric compositions. Though the projects were often accompanied by equally complex theoretical exegeses, Eisenman nonetheless believed that his viewers were able to understand their meaning as they were purportedly derived from the same linguistic and syntactical structures used to express everyday thoughts. The architect-theoretician tried to relate formalism and linguistics logically, distinguishing between meanings that were semantic and those that were syntactical or integral to architecture’s coherence as an object. For Eisenman, formalism was the displacement of the semantic content of a design with the syntactic. The promise of freedom attributed to this displacement underscored Eisenman’s desire to create architecture that was autonomous and free from external constraints arising from pre-established meaning and practical necessity. His view of the “paradoxical nature” of architecture prefigured his subsequent interests in deconstruction and theories of conceptual and “cardboard” (unbuilt) architecture. This includes architectural drawings and plans for projects that may never be built or could not be built.

Structuralism is largely appraised today for the movements that followed and perhaps were reactions to it, variously assembled under the banners of “postmodernism” or “post-structuralism.” Its demise was perhaps due in part to the cumbersome vocabulary developed to describe systems of signification (de Saussure’s terms of and distinction between langue and parole, the division of “signs” into “signifiers” and “signifieds,” Eco’s denotative and connotative functions, and so forth). Questions also arise about the reality behind these terms and equally obscure concepts like Eisenman’s “wellness.” While the vocabulary and concepts might provide the theorist with a framework for describing architectural meanings, they are also largely a-historical and overly formulaic. Structuralism leaves us with the question of whether the so-called “paradoxical nature” of architecture as a system of signification can be reconciled with its determination by, and determining influence on, power and politics.

i. Postmodernism

Postmodernism draws on heterogeneous writing, principally by Jean-François Lyotard and Jean Baudrillard, and was popularized by architect-critics Charles Jencks and Charles Moore; its underlying aims, scope, and methods are subject to considerable debate and contestation (Habermas 1982; Jameson 1991). Although the movement defies easy description, Hal Foster, in The Anti-Aesthetic (1983), identifies two distinct and opposing strains of thought behind its claims. Together, they account for the equally imprecise and ambivalent position of the movement in the history of ideas about architecture.

On the one hand, postmodernism was a reactionary movement; it encouraged opposition to certainties that grounded modernism and modern architecture; it challenged the idea that social progress was adjunct to rational design, for instance, or that building form was relatable to function in a pre-determined way or that any epistemology like semiotics could fully encompass the fluidity, ambiguity and impermanence of meaning. This variant of postmodernism accepted the status quo and rejected, notably in work by Jencks and Moore, the “high art” status of International modernism. It embraced populism based on architectural aesthetics characterized by historicist motifs and bricolage. On the other hand, postmodernism can be seen as a critical stance towards modernism that sought to reappraise its claims to truth, as well as reinforcing, perhaps indirectly, the semiotician’s undertaking to provide a more thorough account of architectural meaning.

Along with joining the chorus of scholars asking “What was postmodernism?” it is worth standing back from the particular claims of its leading figures and examining how philosophical movements such as this have engaged “the question of history” (Attridge and others 1987) and utilized (or eschewed) forms of historical investigation to produce insightful architectural criticism or novel design styles. One can investigate how philosophical concepts are appropriated and possibly misinterpreted by practitioners when the practical demands of clients and corporate patrons intercede or architectural media weigh in with a market for design novelty and appealing visual imagery.

d. Post-structuralism and Power

Post-structuralism is another interdisciplinary movement that emerged in the 1970s and 80s as an extension and critique of structuralism. Its multiple strains are no more easily characterized than postmodernism. Post-structuralism is associated with writing by Michel Foucault, Jacques Derrida, Julia Kristeva, Gilles Deleuze and other continental philosophers. Derrida’s work on deconstruction further popularized textual analyses for studies in the arts and humanities. Inspiring much architectural criticism and coinciding with highly publicized projects like Bernard Tschumi’s competition-winning scheme for Parc de la Villette in Paris (1982) and Zaha Hadid’s unrealized design for Hong Kong’s Peak Club (1983), deconstruction encouraged further unpacking of modernism’s traditions, particularly functionalism. It promised designers a new generative grammar based on the ambiguity, fragmentation, and collision of architectural elements in which systems of representation and habitation were recognized as fluid and contingent. However, the popular reception of deconstruction as an exciting new architectural style may have overshadowed the movement’s critical impetus to firmly position language and meaning within a social matrix. This was enlivened by the dialectic of presence and absence whereby humankind retained a measure of freedom to shape its own identity.

Foucault’s work on knowledge and power develops a key theme of post-structuralism, though he does this in a distinctive (and, for some, idiosyncratic) way using methods that challenge conventional boundaries between modes of philosophical, historical and material analyses. The uneven reception of his oeuvre among architectural historians and theorists is perhaps due to the relatively few works containing explicit references to architecture or architects. Foucault’s analysis of the Panopticon prison in Discipline and Punish (1975) is well known and inspired many studies of space, knowledge and power in the context of disciplinary society.

In one frequently cited interview, Foucault (1982) left his readers with no doubt about the limited agency architectural greats like Le Corbusier or everyday practitioners have in shaping this milieu. It is one where social engineering results not from forms that follow functions (or vice versa) but from techniques of power that engage multiple levels of human experience (material, conceptual and authoritative, and possibly others):

After all, the architect has no power over me. If I want to tear down or change a house he built for me, put up new partitions, add a chimney, the architect has no control. So the architect should be placed in another category—which is not to say that he is not totally foreign to the organization, the implementation, and all the techniques of power that are exercised in a society. I would say that one must take him—his mentality, his attitude—into account as well as his projects, in order to understand a certain number of the techniques of power that are invested in architecture, but he is not comparable to a doctor, a priest, a psychiatrist, or a prison warden. (247-48)

Other works by Foucault have a bearing on philosophy of architecture. In an early work, The Order of Things (1966), for instance, Foucault adopted a quasi-structuralist approach to write an “archaeology of the human sciences” (the subtitle of the book). Amongst other tasks, he locates movements like phenomenology in a historical framework punctuated by ruptures in the representational structures or “epistemes” of Western discourse. Normally, the disciplines of ontology and epistemology would provide methods for this or similar analyses, but Foucault adopted a more radical approach; after all, ontology and epistemology were themselves forms of philosophical inquiry with histories of their own and were already complicit in shoring up the phenomenologist’s claims to truth. His ambition, to stand apart from philosophy in order to see its deepest workings, made for a story of changing relations between signs and the things they came to signify. It was a history where new objects of knowledge appeared, and old ones were lost. This allowed for concepts like “life,” “labor,” and “language” to emerge; to provide new foundations for sciences (biology, economics, and linguistics) to be formed, and to describe the human condition. On Foucault’s account, these terms helped shape a distinctly modern framework for an understanding of humanity and, arguably, the built environment and architecture as well.

4. Selected Lines of Inquiry into Philosophy of Architecture

Topics and lines of inquiry into philosophy of architecture have engaged one or more of the preceding movements and illustrate the breadth, intellectual richness, and relevance of the field. Consider a few of these.

a. Architecture and Representation

The question raised early in this article, “What is architecture?”, invites further inquiry and positions philosophy of architecture as propositional. It calls for consideration of the relations between the material substances and physical properties of buildings and their representation through various media. The key issue is how the materials, tools and techniques of architecture partly or wholly determine what architecture is or can be. Is designing and visualising buildings the same as thinking about their ethical or other value? Does visualising an ideal building or urban form contribute to its meaning or form part of an architectural experience?

The historical development, prevalence and the popular appeal of wide-ranging media, including conventional design and construction drawings and models, new digital media and even film, have shaped architectural discourse and underscored forms of professional expertise. Ways of representing or producing images of buildings, like planimetric (two-dimensional), orthogonal or perspective drawings and, more recently, computer renderings of complex building forms, have supported various design and construction practices. These have also encouraged speculation on the capacity of architecture to embody ideas and entail a distinctive way of conceptualising the world. (For instance, Winters 2011 writes on the critical attitude cultivated by “paper”—delineated, but unbuilt or unbuildable—architecture.) This is a tendency that borrows reasoning from art history, studies of iconology and meaning and influential books like Erwin Panofsky’s Perspective as Symbolic Form (1927). The Euclidean character of visual space has been questioned and, by some accounts, superseded by new visual regimes that challenge conventional understanding of relations between the architectural object and the viewing/inhabiting subject. Even building diagrams, popularly caricatured as the architect’s calling card when scribbled on dinner table napkins and taken to indicate a unique kind of self-reflection, can raise questions about architecture’s intertwined practical and philosophical aspects.

Studies of architectural media have also prompted philosophical reflection on architecture as affording understanding of transcendental values and existential meaning. For some architectural theorists, for instance, plans and drawings are representations of a second order, seemingly distant from all sense of time and place. For others, new digital media allow not only for the easy visualisation, rapid prototyping and construction of novel architectural forms, but also provide insight into the human condition in an era of globalisation and rapid technological change. On the one hand, issues raised by Walter Benjamin’s much-cited essay “The Work of Art in the Age of Mechanical Reproduction” (1935) run parallel to concerns for architecture given the representation and mass-reproduction of building forms and their contribution to dominant global culture. On the other, the dynamism of fluid building forms, so-called architectural “blobs” and forms inspired by Deleuze’s interest in “the fold” or “folded” spaces (1988), promises unheralded opportunities for self-invention and social renewal.

The utility of “representation” as a trans-historical category of critical analysis (in architectural theory and cultural studies, generally) is accompanied and in some cases countered by philosophical reflection on time, temporality, and transience whereby two- and three-dimensional images of buildings possess only limited value in themselves. In architecture, these themes are evident in arguments for the essential timelessness and fundamental intelligibility of Classicism (Porphyrios 1982) that renders it more than a style, or studies describing the physical characteristics of building materials and emphasizing the meaningfulness of weathering (Mostafavi and Leatherbarrow 1993). Building age and the register of weather, organic and human factors on timber, stone and other materials can be valorised as providing the necessary conditions for Heidegger’s concept and state of “being-in-the-world” whereby the alienation of human subjects from the material world of objects is overcome. These relatively recent studies are worth comparing to 18th and early 19th century aesthetic treatises on ruination, the sublime, and picturesque, though their provenance is not wholly attributable to them.

Conversely, on some accounts, architecture comes into its own when distanced from strict demands for functionality, conventional delineation, and commonly-held meanings (Benedikt 1991; Harbison 1991). Visionary schemes set in “cyberspace,” so-called “virtual” and “unbuilt” (also “paper”) architecture are described and valued for their intellectual content, provocative appeal, and their potential to liberate communities from what is construed as the deadweight of the past, historical building styles, and the conservatism of much architectural heritage. From this follows another set of issues discussed in the literature. One is whether or not an architectural work or proposition is complete when drawings are finished, independent of the design’s construction and prior to the project’s occupation and evaluation. If wholly propositional or paper architecture possesses a kind of creative integrity, this raises questions about the necessary contribution (or otherwise) of the project’s sites and settings (physical and performative) to the design process. Is the architect first and foremost a visionary, rather than merely a technician? If so, how can communities come to understand, share, and assess the architect’s largely utopian mission?

b. Architectural Value and Heritage

Questions concerning the integrity of a work of architecture and the autonomy of the architect as a particular kind of expert or visionary correspond to those asked about other kinds of artworks. Can a work of virtual architecture, a painter’s cartoon or an unfinished symphony make a lasting contribution to an artistic canon, or must they invariably be “read” as secondary in importance, interpreted according to existing representational or technical (that is, social) norms? Is unbuilt architecture best left as it is, unrealised, or an unfinished masterpiece best left incomplete, thereby allowing audiences, and posterity, the freedom to fill in the missing pieces? If the latter proposition is correct, then is the meaning of an artwork invariably a social construct? Does the architect, artist, or composer ever have a lasting claim (in terms of its meaning) over their work, as they intended it to be?

These questions highlight philosophical issues concerning a creative work’s contribution to culture and heritage, and they draw further attention to differences between architecture and other forms of art. Debates surrounding the preservation, restoration or adaptive re-use of iconic buildings, for instance, show up technical, social, and political contexts governing architectural value that may not be applicable to other artistic genres. The “Salk Controversy” is a case in point, where disagreement arose over plans for a building addition to Louis Kahn’s Salk Institute at La Jolla, thought by some to contradict the architect’s original design intention (Spector 2001, 166-84). The debate shows up differences in views regarding a building’s past and present integrity as an artistic object and how and to what extent the creative vision of an architect should be privileged over the needs of clients and users. Is the heritage value of either the Salk Institute building, or the stature of the architect Louis Kahn diminished by such additions?

Discourse on architectural heritage was shaped by historical figures like Viollet-le-Duc, Ruskin, and Pugin. Working to refurbish France’s medieval cathedrals, Viollet-le-Duc drew and then followed a fine line between restoring building fabric to its “original” (though invariably hypothetical) condition, on the one hand, and adapting the buildings to evolve according to modern standards of function, taste, and aesthetics, on the other. Drawing together both of these divergent positions was an emerging imperative that architecture, both old and new, should be relevant for the times. This perspective is taken up in architectural theory by Giedion and Harries, among others. This and additional views on architectural ethics and heritage have been enacted by the establishment of institutions such as the National Trust (UK) and its offspring of national and regional heritage councils, the International Council on Monuments and Sites (ICOMOS), and Docomomo, charged with the protection and preservation of modern architecture and urbanism. Awareness of the fluidity of heritage as grounds for questioning the relativity of architectural values has been sharpened by debates generated by controversial demolition and rebuilding projects. Famous episodes include the protracted commercial redevelopment of Paternoster Square at St. Paul’s Cathedral, London (1980s-90s); Venturi and Brown’s postmodernist addition (1991) to the National Gallery on Trafalgar Square (replacing the design famously condemned by Prince Charles as a “monstrous carbuncle”), and the rebuilding (completed 2005) of the Frauenkirche, Dresden, to include evidence of damage from Allied carpet bombing during the Second World War.

c. Environmental Issues

While perhaps always present in some measure, the ethical dimensions of architecture have never been as public and as apropos to the civic and political climate as in the early 21st century. Warwick Fox sees this situation as mainly the result of increasing environmental problems and a concern with the built environment as a heretofore neglected aspect of environmental ethics (2000, 1–12). Fox is partly right, but to see the relation between architecture and ethics exclusively in terms of environmental ethics, as commonly understood, is too narrow. For one thing, drawing on forms of historical, theoretical and practical (also professional) knowledge, architecture, more than most other humanities disciplines, is concerned with multiple conceptions of and concerns for the environment. Viewed as subjects of philosophical inquiry, distinctions between “architecture” and the “built environment,” and between either of these terms and “nature” or “the natural environment,” beg for ontological and epistemological elucidation.

Many of the philosophical concerns about architecture may be seen as a subset or variant of concerns for the built environment. They tend to arise in a cultural sphere, bound by interpretative traditions, entailing the formative concepts, historicity, and rhetorical conventions of the discipline. The primary function of the built environment seems to be to provide for habitation and the requisites of life. The question thus arises as to whether this primary function takes precedence over the aesthetic functions of architecture, specifically expectations for its artistry or meaning.

Moreover, the challenges architects and allied design professionals (particularly planners and urban designers) face in responding to demands for environmentally sustainable buildings with reduced energy consumption and lower carbon emissions, and for cities with greater resilience to global climate change, raise additional philosophical and ethical issues that Vitruvius and his annotators could hardly have imagined. Many of these raise questions about the meaning and scope of sustainability. Is it a matter of science and building technology or behavior—or both? Can buildings be designed sustainably in societies geared for endless growth and consumption? Can a city be made resilient to environmental disaster if this requires the pre-emptive destruction of neighborhoods in vulnerable areas—and possibly worsened levels of social injustice and inequality that may result?

While it may be assumed these concerns and issues have only appeared at the beginning of the 21st century, there are broader, longstanding and overlapping conceptual and practical contexts for locating them historically. In histories of ideas bearing on philosophy and environment (also nature), for instance (Pratt et al. 1999), one learns of the importance of arguments for the uniqueness of living species based on the geographic regions and climates they inhabit. In this regard, today’s environmentalists can be seen as developing thoughts expressed by natural theologians or geographers like Alexander von Humboldt (1769-1859) or systemic botanists like John Hutton Balfour (1808-1884) who described life as a process emerging from interactions between living beings and their surroundings.

Humboldt, Balfour, Darwin, and others contributed to the scientific formulation of ecology as well as spatio-temporal frameworks whereby newly established facts of biological existence could also be used to describe urban societies and environments. Arguably, these frameworks contributed to interests in vernacular architecture and the model of “the primitive hut” (Vidler 1987) as these were interpreted as manifesting links between building forms, patterns of human settlement, and distinctive eras. Advancements in building technology over the course of the 19th and 20th centuries, particularly in the areas of sanitation, illumination, heating, and ventilation, reinforced a largely functionalist view of the interrelationship of building interiors, urban spaces and human wellbeing.

According to one line of thinking, our scientific and technological orientation towards control of the natural world is one contributing factor, not the solution, to environmental crises. The logical conflict of different criteria available to measure a building’s ecological sustainability, for instance (entailing its consumption of energy for lighting and heating versus the energy embodied in its materials and construction), demonstrates the limitations of conventional instrumental or practical reasoning. However, it seems fanciful to anticipate that another philosophy of nature and the built environment will appear—one that is more than merely functionalist and non-individualistic or post-humanistic—to underscore effective environmental activism and remediation.

The developments affecting architectural practices in the 21st century arise from the awareness of the link between the environment and human flourishing, though these developments are reducible to no one single concept about the environment. These include growing unease over hitherto unforeseen consequences of building technology and concomitant processes of industrialism and urbanization. Issues range from local ones such as “sick building syndrome,” pollution, and revelations of the toxicity of building sites, to broader concerns arising from global warming and the depletion of natural resources, including energy resources. These developments have prompted new movements among design practitioners. They include calls for “green architecture” with its emphasis on sustainability and purportedly sustainable practices such as “cradle to cradle” design where building materials are chosen with their life cycles and future recyclability in mind. On a larger scale there is the move towards the “ecological restoration” of natural and urban landscapes aimed at reversing the consequences of environmental degradation or limiting the impacts of future flooding, bushfires and other disasters.

These and other developments directed towards more complete awareness, preservation or restoration of the environment have important subjective and ethical dimensions. These are evident not only in obvious political or design movements, but in ascetic (self-disciplining, restraining, and possibly abstaining) practices involving the design, furnishing, and maintenance of the home, as well as the water-wise planting and rigorous inspection of the suburban garden for invasive species and noxious weeds. What emerges from such practices is a relationship between thought and experience mediated by an understanding of environs, surrounds, spaces, and choice regarding possible ways of living in them.

d. Design Pedagogy

Given the lines of inquiry outlined in this article, it should become clear there is no one single relation between philosophy and architecture. Rather, there are likely multiple connections that make this an important multi-, inter- and trans-disciplinary field, and these connections can be brought to bear to consider the formation of architectural historians, theorists, and designers as particular kinds of intellectuals and “philosophical” agents with responsibilities for the built environment.

Consequently, the opening question “What is architecture?” leads to another. “What is an architect?” There are a variety of answers. One set of responses is to describe what an architect does, namely design, as a distinctive activity that is not only creative and imaginative, comparable to other forms of “art,” but also both critically and practically oriented. Donald Schön in his book The Design Studio (1985) coined the phrase “reflection-in-action” to describe design and, like many design educators, valued the design studio as a unique arena for cultivating creativity and innovation, and for devising novel solutions to social, technological and pragmatic problems. Indeed, it is a common view that design and studio practice are means of articulating just what the pressing problems of the day are or will soon be. There are many “philosophies” and metaphors of design, including descriptors such as “problem-setting” versus “problem solving” and “lateral thinking”—that old shibboleth of many devotees of design and “the creative industries.” There are also different ways of describing what the ideal design process should be (rational, but not linear; reiterative, cyclical, and so forth).

For Winters (2011), the value of the “paper” architecture (sketch designs, drawings, and other media representations of unbuilt and perhaps unbuildable work) routinely produced in schools is that it throws into sharp relief the capacity of architecture as a visual art to be infused with a “critical attitude” combining the Apollonian and the Dionysian conceptions of aesthetics. The first entails the disinterested contemplation of the creative object as form, and the second active participation in and self-formation through an aesthetic experience, so that:

The designed environment unfolds before us requiring our occupational presence to make it whole. It is in this sense that a work of architecture displays itself as a canvas upon which to project the systematic undertakings that are constitutive of a life, but unlike the blank canvas, this canvas has marked out across its surface patterns that present themselves as suitable accommodation for our endeavors. (67)

Like the Vitruvian triad or the phenomenologist’s favored concept of “poiesis,” many of these descriptions impose—rather than merely recognize—a particular ontological and epistemological order on design acts and, more or less, stress their reasonableness, reliability, and universal applicability. Conversely, it could be argued that the epithet “design” encompasses a number of different cognitive, imaginative, and creative acts; these have histories and institutional settings that cannot be reduced to one common denominator.

Aesthetics is not only grounds for connecting philosophy to architecture in a multi-disciplinary field. It is also commonly the chief vehicle for composing and teaching histories of architecture, for teaching design and assessing design outcomes, and, often, for positioning a student’s ambitions at “the cutting edge” of design. Alertness to historical, social, and political contexts impacting our understanding of “design” begs greater openness towards the domain of “aesthetico-ethical” exercises that the activity and related metaphors and methodologies routinely entail. These include perceptions and discriminations of various kinds which make the built environment something to be considered, reflected upon, and acted upon, in the design studio but also more broadly, in everyday life across society. Discriminations, such as between the form and function of a building, or between the “utilitas” and “venustas” of architecture, are means whereby a wholeness of character, psychological closure or renewal of community is sought among other aspirations or, conversely, whereby our passions and desires for a wholeness of the self, closure, and community are subverted. Such discriminations are exercised by individuals occupying a number of subject positions, degrees of knowledge, and authority. They are acted upon in multiple and overlapping social and political arenas.

It is clear that the field of philosophy of architecture has much work cut out for it. However, given the admittedly only partial account of its concerns as outlined here, it is likely that the significance of what is in some ways merely a nascent subfield within both philosophy and architecture will grow.

5. References and Further Reading

  • Attridge, D., G. Bennington, and R. Young, 1987, Post-structuralism and the question of history, Cambridge: Cambridge University Press.
  • Baird, B., and C. Jencks, 1969, Meaning in Architecture, London: Barrie and Rockliffe.
  • Ballantyne, A., 2011, “Architecture, Life, and Habitat,” The Journal of Aesthetics and Art Criticism, 69 (1): 43-49.
  • Barthes, R., 1971, “Semiology and the Urban,” reprinted in The City and the Sign: An Introduction to Urban Semiotics, M. Gottdiener and A. Lagopoulos (eds), New York: Columbia University Press, 1986, pp. 88-98.
  • Benedikt, M., 1991, Cyberspace: first steps, Cambridge, MA: MIT Press.
  • Brain, D., 2005, “From Good Neighborhoods to Sustainable Cities: Social Science and the Social Agenda of the New Urbanism,” International Regional Science Review, 28 (2): 217-238.
  • Brain, D., 2006, “Democracy and Urban Design: The Transect as Civic Renewal,” Places, 18 (1): 18-23.
  • Budd, M., 1995, Values of Art: Pictures, Poetry and Music, London and New York: Penguin.
  • Davies, S., 1994, “Is Architecture Art?,” in Philosophy and Architecture, M. Mitias (ed.), Amsterdam and Atlanta: Rodopi, pp. 31-47.
  • Deleuze, G., 1988, The fold: Leibniz and the Baroque, London: Continuum, 2006.
  • Eco, U., 1968, “Function and Sign: Semiotics of Architecture,” reprinted in Rethinking Architecture: A Reader in Cultural Theory, N. Leach (ed.), New York: Routledge, 1997, pp. 166-204.
  • Foucault, M. 1982, “Space, Knowledge and Power,” in The Foucault Reader, P. Rabinow (ed.), New York: Pantheon, 1984, pp. 239-256.
  • Fox, W., (ed.), 2000, Ethics and the Built Environment, London and New York: Routledge.
  • Gans, H., 1968, People and Plans: Essays on Urban Problems and Solutions, New York: Basic Books.
  • Giedion, S., 1947, Space, Time and Architecture, Cambridge, MA: Harvard University Press, 5th ed., 1974.
  • Goldblatt, D. and R. Padden 2011, “Introduction” to “The Aesthetics of Architecture: Philosophical Investigations into the Nature of Building,” special issue of The Journal of Aesthetics and Art Criticism, 69 (1): 1-6.
  • Graham, G., 1989, “Art and Architecture,” British Journal of Aesthetics, 29 (3): 248-257.
  • Habermas, J., 1982, “Modern and Post-Modern Architecture,” trans. H. Tsoskounglou, 9H, 4: 9-14.
  • Haldane, J., 1999, “Form, meaning and value: a history of the philosophy of architecture,” The Journal of Architecture, 4: 9-20.
  • Harbison, R., 1991, The built, the unbuilt, and the unbuildable: in pursuit of architectural meaning, London: Thames and Hudson.
  • Harries, K., 1997, The Ethical Function of Architecture, Cambridge, MA: The MIT Press.
  • Heidegger, M., 1951, “Building, dwelling, thinking,” in Poetry, Language, Thought, trans. A. Hofstadter, New York: Harper and Row, 1975, pp. 145-161.
  • Jameson, F., 1991, Postmodernism, or the Cultural Logic of Late Capitalism, London: Verso.
  • Jencks, C., 1977, The Language of Post-Modern Architecture, New York: Rizzoli, 1984.
  • Lagueux, M., 2004, “Ethics Versus Aesthetics in Architecture,” Philosophical Forum, 35 (2): 117-133.
  • Lawhon, L. L., 2009, “The Neighborhood Unit: Physical Design or Physical Determinism?,” Journal of Planning History, 20: 1-22.
  • Leccese, M. and K. McCormick, (eds.), 2000, The Charter of the New Urbanism, New York: McGraw Hill.
  • Loudon, J. C., 1822, An Encyclopaedia of Gardening, London: Longman, 1835.
  • Mitias, M., 1999, “The aesthetic experience of the architectural work,” Journal of Aesthetic Education, 33 (3): 61–77.
  • Mostafavi, M., and D. Leatherbarrow, 1993, On weathering: the life of buildings in time, Cambridge, MA: MIT Press.
  • Mueller, G., 1960, “Philosophy and Architecture,” AIA Journal, 34: 38-43.
  • Norberg-Schulz, C., 1980, c1979, Genius loci: towards a phenomenology of architecture, London: Academy Editions.
  • Parsons, G., and A. Carlson, 2008, Functional Beauty, Oxford: Clarendon Press.
  • Pérez-Gómez, A., 1983, Architecture and the Crisis of Modern Science, Cambridge, MA: MIT Press.
  • Pevsner, N., 1936, Pioneers of the Modern Movement; renamed Pioneers of Modern Design, partly revised and rewritten edition, Harmondsworth: Penguin, 1975.
  • Podro, M., 1982, The Critical Historians of Art, New Haven, CT and London: Yale University Press.
  • Porphyrios, D., 1982, “Classicism is Not a Style,” reprinted in Classical architecture, London: Academy Editions, 1991.
  • Pratt, V., J. Howarth, and E. Brady, 1999, Environment and Philosophy, London and New York: Routledge.
  • Ruskin, J., 1849, The Seven Lamps of Architecture, New York: Noonday, 1974.
  • Scruton, R., 1979, The Aesthetics of Architecture, Princeton, NJ: Princeton University Press.
  • Smith, C., 1992, Architecture in the Culture of Early Humanism: Ethics, Aesthetics, and Eloquence 1400-1470, New York: Oxford University Press.
  • Smith, D. W., 2003, “Phenomenology,” in The Stanford Encyclopedia of Philosophy, E.N. Zalta (ed.), (Winter edition). Online. Available HTTP: <http://plato.stanford.edu/archives/win2003/entries/davidson/> (last accessed 25 June 2010).
  • Spector, T., 2001, The Ethical Architect: the dilemma of contemporary practice, New York: Princeton Architectural Press.
  • Venturi, R., D. S. Brown, & S. Izenour, 1972, Learning from Las Vegas: The Forgotten Symbolism of Architectural Form, revised edition, Cambridge, MA: MIT Press, 1977.
  • Vidler, A., 1987, “Rebuilding the Primitive Hut,” in The Writing of the Walls, Princeton, NJ: Princeton University Press, pp. 7-21.
  • Watkin, D., 1984, Morality and Architecture, Chicago: University of Chicago Press.
  • Whyte, W., 2006, “How do Buildings Mean? Some Issues of Interpretation in the History of Architecture,” History and Theory, 45: 153-177.
  • Winters, E., 2001, “Architecture,” in The Routledge Companion to Aesthetics, B. Gaut, and D. Lopes (eds.), London and New York: Routledge, 2nd edition, pp. 655-667.
  • Winters, E., 2011, “A Dance to the Music of Architecture,” The Journal of Aesthetics and Art Criticism, 69 (1): 61-67.

 

Author Information

William M. Taylor
Email: bill.taylor@uwa.edu.au
University of Western Australia
Australia

and

Michael P. Levine
Email: michael.levine@uwa.edu.au
University of Western Australia
Australia

Art and Interpretation

Interpretation in art refers to the attribution of meaning to a work. A point on which people often disagree is whether the artist’s or author’s intention is relevant to the interpretation of the work. In the Anglo-American analytic philosophy of art, views about interpretation branch into two major camps: intentionalism and anti-intentionalism, with an initial focus on one art, namely literature.

The anti-intentionalist maintains that a work’s meaning is entirely determined by linguistic and literary conventions, thereby rejecting the relevance of the author’s intention. The underlying assumption of this position is that a work enjoys autonomy with respect to meaning and other aesthetically relevant properties. Extra-textual factors, such as the author’s intention, are neither necessary nor sufficient for meaning determination. This early position in the analytic tradition is often called conventionalism because of its strong emphasis on convention. Anti-intentionalism gradually went out of favor at the end of the 20th century, but it has seen a revival in the so-called value-maximizing theory, which recommends that the interpreter seek value-maximizing interpretations constrained by convention and, according to a different version of the theory, by the relevant contextual factors at the time of the work’s production.

By contrast, the initial brand of intentionalism—actual intentionalism—holds that interpreters should concern themselves with the author’s intention, for a work’s meaning is affected by such intention. There are at least three versions of actual intentionalism. The absolute version identifies a work’s meaning fully with the author’s intention, therefore allowing that an author can intend her work to mean whatever she wants it to mean. The extreme version acknowledges that the possible meanings a work can sustain have to be constrained by convention. According to this version, the author’s intention picks the correct meaning of the work as long as it fits one of the possible meanings; otherwise, the work ends up being meaningless. The moderate version claims that when the author’s intention does not match any of the possible meanings, meaning is fixed instead by convention and perhaps also context.

A second brand of intentionalism, which finds a middle course between actual intentionalism and anti-intentionalism, is hypothetical intentionalism. According to this position, a work’s meaning is the appropriate audience’s best hypothesis about the author’s intention based on publicly available information about the author and her work at the time of the piece’s production. A variation on this position attributes the intention to a hypothetical author who is postulated by the interpreter and who is constituted by work features. Such authors are sometimes said to be fictional because they, being purely conceptual, differ decisively from flesh-and-blood authors.

This article elaborates on these theories of interpretation and considers their notable objections. The debate about interpretation covers other art forms in addition to literature, and the theories of interpretation have accordingly been extended across many of the arts. This broad outlook is assumed throughout the article, although nothing said is affected even if a narrow focus on literature is adopted.

Table of Contents

  1. Key Concepts: Intention, Meaning, and Interpretation
  2. Anti-Intentionalism
    1. The Intentional Fallacy
    2. Beardsley’s Speech Act Theory of Literature
    3. Notable Objections and Replies
  3. Value-Maximizing Theory
    1. Overview
    2. Notable Objections and Replies
  4. Actual Intentionalism
    1. Absolute Version
    2. Extreme Version
    3. Moderate Version
    4. Objections to Actual Intentionalism
  5. Hypothetical Intentionalism
    1. Overview
    2. Notable Objections and Replies
  6. Hypothetical Intentionalism and the Hypothetical Artist
    1. Overview
    2. Notable Objections and Replies
  7. Conclusion
  8. References and Further Reading

1. Key Concepts: Intention, Meaning, and Interpretation

It is common for us to ask questions about works of art due to puzzlement or curiosity. Sometimes we do not understand the point of the work. What is the point of, for example, Metamorphosis by Kafka or Duchamp’s Fountain? Sometimes there is ambiguity in a work and we want it resolved. For example, is the final sequence of Christopher Nolan’s film Inception reality or another dream? Or do ghosts really exist in Henry James’s The Turn of the Screw? Sometimes we make hypotheses about details in a work. For instance, does the woman in white in Raphael’s The School of Athens represent Hypatia? Is the conch in William Golding’s Lord of the Flies a symbol for civilization and democracy?

What these questions have in common is that all of them seek after things that go beyond what the work literally presents or says. They are all concerned with the implicit contents of the work or, for simplicity, with the meanings of a work. A distinction can be drawn between two kinds of meaning in terms of scope. Meaning can be global in the sense that it concerns the work’s theme, thesis, or point. For example, an audience first encountering Duchamp’s Fountain would want to know Duchamp’s point in producing this readymade or, put otherwise, what the work as a whole is made to convey. The same goes for Kafka’s Metamorphosis, which contains so bizarre a plot as to make the reader wonder what the story is all about. Meaning can also be local insofar as it is about what a part of a work conveys. Inquiries into the meaning of a particular sequence in Christopher Nolan’s film, the woman in Raphael’s fresco, or the conch in William Golding’s Lord of the Flies are directed at only part of the work.

We are said to be interpreting when trying to find out answers to questions about the meaning of a work. In other words, interpretation is the attempt to attribute work-meaning. Here “attribute” can mean “recover,” which is retrieving something already existing in a work; or it can more weakly mean “impose,” which entails ascribing a meaning to a work without ontologically creating anything. Many of the major positions in the debate endorse either the impositional view or the retrieval view.

When an interpretative question arises, a frequent way to deal with it is to resort to the creator’s intention. We may ask the artist to reveal her intention if such an opportunity is available; we may also check what she says about her work in an interview or autobiography. If we have access to her personal documents such as diaries or letters, they too will become our interpretative resources. These are all evidence of the artist’s intention. When the evidence is compelling, we have good reason to believe it reveals the artist’s intention.

Certainly, there are cases in which external evidence of the artist’s intention is absent, including when the work is anonymous. This poses no difficulty for philosophers who view appeal to artistic intention as crucial, for they accept that internal evidence—the work itself—is the best evidence of the artist’s intention. Most of the time, close attention to details of the work will lead us to what the artist intended the work to mean.

But what is intention exactly? Intention is a kind of mental state usually characterized as a design or plan in the artist’s mind to be realized in her artistic creation. This crude view of intention is sometimes refined into the reductive analysis one will find in a contemporary textbook of philosophy of mind: intention is constituted by belief and desire. Some actual intentionalists explain the nature of intention from a Wittgensteinian perspective: authorial intention is viewed as the purposive structure of the work that can be discerned by close inspection. This view challenges the supposition that intentions are always private and logically independent of the work they cause, which is often interpreted as a position held by anti-intentionalists.

A 2005 proposal holds that intentions are executive attitudes toward plans (Livingston). These attitudes are firm but defeasible commitments to acting on those plans. Contra the reductive analysis of intention, this view holds that intentions are distinct and real mental states that serve a range of functions irreducible to other mental states.

Clarifying each of these basic terms (meaning, interpretation, and intention) requires an essay-length treatment that cannot be done here. For current purposes, it suffices to introduce the aforesaid views and proposals commonly assumed. Bear in mind that for the most part the debate over art interpretation proceeds without consensus on how to define these terms, and clarifications appear only when necessary.

2. Anti-Intentionalism

Anti-intentionalism is considered the first theory of interpretation to emerge in the analytic tradition. It is normally seen as affiliated with the New Criticism movement that was prevalent in the middle of the twentieth century. The position was initially a reaction against biographical criticism, the main idea of which is that the interpreter, to grasp the meaning of a work, needs to study the life of the author because the work is seen as reflecting the author’s mental world. This approach led to people considering the author’s biographical data rather than her work. Literary criticism became criticism of biography, not criticism of literary works. Against this trend, literary critic William K. Wimsatt and philosopher Monroe C. Beardsley coauthored a seminal paper “The Intentional Fallacy” in 1946, marking the starting point of the intention debate. Beardsley subsequently extended his anti-intentionalist stance across the arts in his monumental book Aesthetics: Problems in the Philosophy of Criticism ([1958] 1981a).

a. The Intentional Fallacy

The main idea of the intentional fallacy is that appeal to the artist’s intention outside the work is fallacious, because the work itself is the final arbiter of what meaning it bears. This contention is based on the anti-intentionalist’s ontological assumption about works of art.

This underlying assumption is that a work of art enjoys autonomy with respect to meaning and other aesthetically relevant properties. As Beardsley’s Principle of Autonomy shows, critical statements will in the end need to be tested against the work itself, not against factors outside it. To give Beardsley’s example, whether a statue symbolizes human destiny depends not on what its maker says but on our being able to make out that theme from the statue on the basis of our knowledge of artistic conventions: if the statue shows a man confined to a cage, we may well conclude that the statue indeed symbolizes human destiny, for by convention the image of confinement fits that alleged theme. The anti-intentionalist principle hence follows: the interpreter should focus on what she can find in the work itself—the internal evidence—rather than on external evidence, such as the artist’s biography, to reveal her intentions.

Anti-intentionalism is sometimes called conventionalism because it sees convention as necessary and sufficient in determining work-meaning. On this view, the artist’s intention at best underdetermines meaning even when operating successfully. This can be seen from the famous argument offered by Wimsatt and Beardsley: either the artist’s intention is successfully realized in the work, or it fails; if the intention is successfully realized in the work, appeal to external evidence of the artist’s intention is not necessary (we can detect the intention from the work); if it fails, such appeal becomes insufficient (the intention turns out to be extraneous to the work). The conclusion is that an appeal to external evidence of the artist’s intention is either unnecessary or insufficient. As the second premise of the argument shows, the artist’s intention is unnecessary in determining meaning for the reason that convention alone can do the trick. As a result, the overall argument entails the irrelevance of external evidence of the artist’s intention. To think of such evidence as relevant is to commit the intentional fallacy.
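The argument just stated has the shape of a constructive dilemma, which can be set out schematically. The letters are labels introduced here for illustration only, not Wimsatt and Beardsley’s own notation: S stands for “the artist’s intention is successfully realized in the work,” N for “appeal to external evidence of intention is necessary,” and F for “appeal to external evidence of intention is sufficient.”

```latex
\begin{align*}
&\text{(P1)}\quad S \lor \lnot S
  && \text{the intention is realized, or it fails} \\
&\text{(P2)}\quad S \rightarrow \lnot N
  && \text{if realized, the appeal is not necessary} \\
&\text{(P3)}\quad \lnot S \rightarrow \lnot F
  && \text{if it fails, the appeal is not sufficient} \\
&\text{(C)}\quad \lnot N \lor \lnot F
  && \text{the appeal is either unnecessary or insufficient}
\end{align*}
```

On this reading the schema is valid, so any resistance to the conclusion has to target one of the premises, most naturally the claim in (P2) that a realized intention can always be detected from the work alone.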

There is a second way to formulate the intentional fallacy. Since the artist does not always successfully realize her intention, the inference is invalid from the premise that the artist intended her work to mean p to the conclusion that the work in question does mean p. Therefore, the term “intentional fallacy” has two layers of meaning: normatively, it refers to the questionable principle of interpretation that external evidence of intent should be appealed to; ontologically, it refers to the fallacious inference from probable intention to work-meaning.

b. Beardsley’s Speech Act Theory of Literature

Beardsley at a later point develops an ontology of literature in favor of anti-intentionalism (1981b, 1982). Reviving Plato’s imitation theory of art, Beardsley claims that fictional works are essentially imitations of illocutionary acts. Briefly put, illocutionary acts are performed by utterances in particular contexts. For example, when a detective, convinced that someone is the killer, points his finger at that person and utters the sentence “you did it,” the detective is performing the illocutionary act of accusing someone. What illocutionary act is being performed is traditionally construed as jointly determined by the speaker’s intention to perform that act, the words uttered, and the relevant conditions in that particular context. Other examples of illocutionary acts include asserting, warning, castigating, asking, and the like.

Literary works can be seen as utterances; that is, texts used in a particular context to perform different illocutionary acts by authors. However, Beardsley claims that in the case of fictional works in particular, the purported illocutionary force will always be removed so as to make the utterance an imitation of that illocutionary act. When an attempted act is insufficiently performed, it ends up being represented or imitated. For example, if I say “please pass me the salt” in my dining room when no one except me is there, I end up representing (imitating) the illocutionary act of requesting because there is no uptake from the intended audience. Since the illocutionary act in this case is only imitated, it qualifies as a fictional act. This is why Beardsley sees fiction as representation.

Consider the uptake condition in the case of fictional works. Such works are not addressed to the audience as a talk is: there is no concrete context in which the audience can be readily identified. The uttered text hence loses its illocutionary force and ends up being a representation. Aside from this “address without access,” another obtaining condition for a fictional illocutionary act is the existence of non-referring names and descriptions in a fictional work. If an author writes a poem in which she greets the great detective Sherlock Holmes, this greeting will never obtain, because the name Sherlock Holmes does not refer to any existing person in the world. The greeting will only end up being a representation or a fictional illocution. By parity of reasoning, fictional works end up being representations of illocutionary acts in that they always contain names or descriptions involving events that never take place.

Now we must ask: by what criterion do we determine what illocutionary act is represented? It cannot be the speaker or author’s intention, because even if a speaker intends to represent a particular illocutionary act, she might end up representing another. Since the possibility of failed intention always exists, intention would not be an appropriate criterion. Convention is again invoked to determine the correct illocutionary act being represented. It is true that any practice of representing is intentional at the start in the sense that what is represented is determined by the representer’s intention. Nevertheless, once the connection between a symbol and what it is used to represent is established, intention is said to be detached from that connection, and deciding the content of a representation becomes a sheer matter of convention.

Since a fictional work is essentially a representation of an insufficiently performed illocutionary act, determining what it represents does not require us to go beyond that incomplete performance, just as determining what a mime is imitating does not require the audience to consider anything outside her performance, such as her intention. What the mime is imitating is completely determined by how we conventionally construe the act being performed. In a similar fashion, when considering what illocutionary act is represented by a fictional work, the interpreter should rely on internal evidence rather than on external evidence of authorial intent to construct the illocutionary act being represented. If, based on internal data, a story reads like a castigation of war, it is suitably seen as a representation of that illocutionary act. The conclusion is that the author’s intention plays no role in fixing the content of a fictional work.

Lastly, it is worth mentioning that Beardsley’s attitude toward nonfictional works is ambivalent. Obviously, his speech act argument applies to fictional works only, and he accepts that nonfictional works can be genuine illocutions. This category of works tends to have a more identifiable audience, who is hence not addressed without access. With illocutions, Beardsley continues to argue for an anti-intentionalist view of meaning according to which the utterer’s intention does not determine meaning. But his accepting nonfictional works as illocutions opens the door to considerations of external or contextual factors that go against his earlier stance, which is globally anti-intentionalist.

c. Notable Objections and Replies

One immediate concern with anti-intentionalism is whether convention alone can point to a single meaning (Hirsch, 1967). The common reason why people debate about interpretation is precisely that the work itself does not offer sufficient evidence to disambiguate meaning. Very often a work can sustain multiple meanings and the problem of choice prompts some people to appeal to the artist’s intention. It does not seem plausible to say that one can assign only a single meaning to works like Ulysses or Picasso’s abstract paintings if one concentrates solely on internal evidence. To this objection, Beardsley (1970) insists that, in most cases, appeal to the coherence of the work can eventually leave us with a single correct interpretation.

A second serious objection to anti-intentionalism is the case of irony (Hirsch, 1976, pp. 24–5). It seems reasonable to say that whether a work is ironic depends on whether its creator intended it to be so. For instance, based on internal evidence, many people took Daniel Defoe’s pamphlet The Shortest Way with the Dissenters to be genuinely against the Dissenters upon its publication. However, the only ground for saying that the pamphlet is ironic seems to be Defoe’s intention. If irony is a crucial component of the work, ignoring it would fail to respect the work’s identity. It follows that irony cannot be grounded in internal evidence alone. Beardsley’s reply (1982, pp. 203–7) is that irony must offer the possibility of understanding. If the artist cannot imagine anyone taking it ironically, there would be no reason to believe the work to be ironic.

However, the problem of irony is only part of a bigger concern that challenges the irrelevance of external factors to interpretation. Many factors present at the time of the work’s creation seem to play a key role in shaping a work’s identity and content. Missing out on these factors would lead us to misidentify the work (and hence to misinterpret it).

For instance, a work will not be seen as revolutionary unless the interpreter knows something about the contemporaneous artistic tradition: ignoring the work’s innovation amounts to accepting that the work can lose its revolutionary character while remaining self-identical. If we see this character as identity-relevant, we should then take it into consideration in our interpretation. The same line of thinking goes for other identity-conferring contextual factors, such as the social-historical conditions and the relations the work bears to contemporaneous or prior works. The present view is thus called ontological contextualism to foreground the ontological claim that the identity and content of a work of art are in part determined by the relations it bears to its context of production.

Contextualism leads to an important distinction between work and text in the case of literature. In a nutshell: a text is not context-dependent but a work is. The anti-intentionalist stance thus leads the interpreter to consider texts rather than works because it rejects considerations of external or contextual factors. The same distinction goes for other art forms when we draw a comparison between an artistic production considered in its brute form and in its context of creation. For convenience, the word “work” is used throughout with notes on whether contextualism is taken or not.

As a reply to the contextualist objection, it has been argued (Davies, 2005) that Beardsley’s position allows for contextualism. If this is convincing, the contextualist criticism of anti-intentionalism would not be conclusive.

3. Value-Maximizing Theory

a. Overview

The value-maximizing theory can be viewed as being derived from anti-intentionalism. Its core claim is that the primary aim of art interpretation is to offer interpretations that maximize the value of a work. There are at least two versions of the maximizing position distinguished by the commitment to contextualism. When the maximizing position is committed to contextualism, the constraint on interpretation will be convention plus context (Davies, 2007); otherwise, the constraint will be convention only, as endorsed by anti-intentionalism (Goldman, 2013).

As indicated, the word “maximize” does not imply monism. That is, the present position does not claim that there can be only a single way to maximize the value of a work of art. On the contrary, it seems reasonable to assume that in most cases the interpreter can envisage several readings to bring out the value of the work. For example, Kafka’s Metamorphosis has generated a number of rewarding interpretations, and it is difficult to argue for a single best among them. As long as an interpretation is revealing or insightful under the relevant interpretative constraints, we may count it as value-maximizing. Such being the case, the value-maximizing theory may be relabelled the “value-enhancing” or “value-satisfying” theory.

Given this pluralist picture, the maximizer, unlike the anti-intentionalist, will need to accept the indeterminacy thesis that convention (and context, if she endorses contextualism) alone does not guarantee the unambiguity of the work. This allows the maximizing position to bypass the challenge posed by said thesis, rendering it a more flexible position than anti-intentionalism in regard to the number of legitimate interpretations.

Encapsulating the maximizing position in a few words: it holds that the primary aim of art interpretation is to enhance appreciative satisfaction by identifying interpretations that bring out the value of a work within reasonable limits set by convention (and context).

b. Notable Objections and Replies

The actual intentionalist will maintain that figurative features such as irony and allusion must be analysed intentionalistically. The maximizer with contextualist commitment can counter this objection by dealing with intentions in a more sophisticated way. If the relevant features are identity conferring, they will be respected and accepted in interpretation. In this case, any interpretation that ignores the intended feature ends up misidentifying the work. But if the relevant features are not identity conferring, more room will be left for the interpreter to consider them. The intended feature can be ignored if it does not add to the value of the work. By contrast, where such a feature is not intended but can nonetheless be found in the work, the interpreter can still build it into the interpretation if it is value enhancing.

The most important objection to the maximizing view is that it is in danger of turning a mediocre work into a masterpiece. Ed Wood’s film Plan 9 from Outer Space is the most discussed example. Many people consider this work to be the worst film ever made. However, interpreting it from a postmodern perspective as satire (presumably a value-enhancing interpretation) would turn it into a classic.

The maximizer with contextualist leanings can reply that the postmodern reading fails to identify the film as authored by Wood (Davies, 2007, p. 187). Postmodern views were not available in Wood’s time, so it was impossible for the film to be created as such. Identifying the film as postmodernist amounts to an anachronism that disrespects the work’s identity. The moral of this example is that the maximizer does not blindly enhance the value of a work. Rather, the work to be interpreted needs to be contextualized first to ensure that subsequent attributions of aesthetic value are done in light of the true and fair presentation of the work.

4. Actual Intentionalism

Contra anti-intentionalism, actual intentionalism maintains that the artist’s intention is relevant to interpretation. The position comes in at least three forms, giving different weights to intention. The absolute version claims that work-meaning is fully determined by the artist’s intention; the extreme version claims that the work ends up being meaningless when the artist’s intention is incompatible with it; and the moderate version claims that either the artist’s intention determines meaning or—if this fails—meaning is determined instead by convention (and context, if contextualism is endorsed).
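The three versions can be contrasted as three different rules for fixing meaning. The following is only a minimal sketch under simplifying assumptions: meanings are modelled as plain strings, the set `sustained` (the readings the work can bear) and the value `fallback` (the meaning supplied by convention and context) are taken as given, and all function and parameter names are hypothetical labels rather than terms from the literature.

```python
from typing import Optional, Set


def absolute_meaning(intended: str) -> str:
    """Absolute version: the work means whatever the artist intends."""
    return intended


def extreme_meaning(intended: str, sustained: Set[str]) -> Optional[str]:
    """Extreme version: intention fixes meaning only if the work can sustain
    the intended reading; otherwise the work counts as meaningless (None)."""
    return intended if intended in sustained else None


def moderate_meaning(intended: str, sustained: Set[str], fallback: str) -> str:
    """Moderate version: a successful intention fixes meaning; when it fails,
    convention (and context) supplies the meaning instead."""
    return intended if intended in sustained else fallback


# Hypothetical case: the work can sustain two readings, and the artist
# intends a third that the work cannot sustain.
sustained = {"satire", "tragedy"}
print(absolute_meaning("farce"))                       # farce
print(extreme_meaning("farce", sustained))             # None
print(moderate_meaning("farce", sustained, "satire"))  # satire
```

The sketch leaves open how membership in `sustained` is decided; that is the question the success conditions discussed under the moderate version try to answer.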

a. Absolute Version

Absolute actual intentionalism claims that a work means whatever its creator intends it to mean. Put otherwise, it sees the artist’s intention as the necessary and sufficient condition for a work’s meaning. This position is often dubbed Humpty-Dumptyism with reference to the character Humpty-Dumpty in Through the Looking-Glass. This character tries to convince Alice that he can make a word mean what he chooses it to mean. This unsettling conclusion is supported by the argument about intentionless meaning: a mark (or a sequence of marks) cannot have meaning unless it is produced by an agent capable of intentional activities; therefore, meaning is identical to intention.

It seems plausible to abandon the thought that marks on the sand are a poem once we know they were caused by accident. But this at best proves that intention is the necessary condition for something’s being meaningful; it does not prove further that what something means is what the agent intended it to mean. In other words, the argument about intentionless meaning does a better job in showing that intention is an indispensable ingredient for meaningfulness than in showing that intention infallibly determines the meaning conveyed.

b. Extreme Version

To avoid Humpty-Dumptyism, the extreme actual intentionalist rejects the view that the artist’s intention infallibly determines work-meaning and accepts the indeterminacy thesis that convention alone does not guarantee a single evident meaning to be found in a work. The extreme intentionalist claims further that the meaning of the work is fixed by the artist’s intention if her intention identifies one of the possible meanings sustained by the work; otherwise, the work ends up being meaningless (Hirsch, 1967). Better put, the extreme intentionalist sees intention as the necessary rather than sufficient condition for work-meaning.

Aside from the unsatisfactory result that a work becomes meaningless when the artist’s intention fails, the present position faces a dilemma when dealing with the case of figurative language (Nathan, in Iseminger (1992)). Take irony for example. The first horn of the dilemma is as follows: Constrained by linguistic conventions, the range of possible meanings has to include the negation of the literal meaning in order for the intended irony to be effective. But this results in absolute intentionalism: every expression would be ironic as long as the author intends it to be. But—this is the second horn—if the range of possible meanings does not include the negation of literal meaning, the expression simply becomes meaningless in that there is no appropriate meaning possible for the author to actualize. It seems that a broader notion of convention is needed to explain figurative language. But if the extreme intentionalist makes that move, her intentionalist position will be undermined, for the author’s intention would be given a less important role than convention in such cases. However, this problem does not arise when the actual intentionalist is committed to contextualism, for in that case the contextual factors that make the intended irony possible will be taken into account.

c. Moderate Version

Though there are several different versions of moderate actual intentionalism, they share the common ground that when the artist’s intention fails, meaning is fixed instead by convention and context. (Whether all moderate actual intentionalists take context into account is controversial and this article will not dig into this controversy for reasons of space.) That is, when the artist’s intention is successful, it determines meaning; otherwise, meaning is determined by convention plus context (Carroll, 2001; Stecker, 2003; Livingston, 2005).

As seen, an intention is successful so long as it identifies one of the possible meanings sustained by the work even if the meaning identified is less plausible than other candidates. But what exactly is the interpreter doing when she identifies that meaning? It is reasonable to say that the interpreter does not need to ascertain all the possible meanings and see if there is a fit. Rather, all she needs to do is to see whether the intended meaning can be read in accordance with the work. This is why the moderate intentionalist puts the success condition in terms of compatibility: an intention is successful so long as the intended meaning is compatible with the work. The fact that a certain meaning is compatible with the work means that the work can sustain it as one of its possible meanings.

Unfortunately, the notion of compatibility seems to allow strange cases in which an insignificant intention can determine work-meaning as long as it is not explicitly rejected by the relevant interpretative constraint. For example, if Agatha Christie reveals that Hercule Poirot is actually a smart Martian in disguise, the moderate intentionalist would need to accept it because this proclamation of intention can still be said to be compatible with the text in the sense that it is not rejected by textual evidence. To avoid this bad result, compatibility needs to be qualified.

The moderate intentionalist then analyses compatibility in terms of the meshing condition, which refers to a sufficient degree of coherence between the content of the intention and the work’s rhetorical patterns. An intention is compatible with the work in the sense that it meshes well with the work. The Martian case will hence be ruled out by the meshing condition because it does not engage sufficiently with the narrative even if it is not explicitly rejected by textual evidence. The meshing condition is a minimal or weak success condition in that it does not require the intention to mesh with every textual feature. A sufficient amount will do, though the moderate intentionalist admits that the line is not always easy to draw. With this weak standard for success, it can happen that the interpreter is not able to discern the intended meaning in the work before she learns of the artist’s intention.

There is a second kind of success condition which adopts a stronger standard (Stecker, 2003; Davies, 2007, pp. 170–1). This standard for success states that an intention is successful just in case the intended meaning, among the possible meanings sustained by the work, is the one most likely to secure uptake from a well-backgrounded audience (with contextual knowledge and all). For example, if a work of art, within the limits set by convention and context, affords interpretations x, y, and z, and x is more readily discerned than the other two by the appropriate audience, then x is the meaning of the work.

These accounts of the success condition answer a notable objection to moderate intentionalism. This objection claims that moderate intentionalism faces an epistemic dilemma (Trivedi, 2001). Consider an epistemic question: how do we know whether an intention is successfully realized? Presumably, we figure out work-meaning and the artist’s intention respectively and independently of each other. And then we compare the two to see if there is a fit. Nevertheless, this move is redundant: if we can figure out work-meaning independently of actual intention, why do we need the latter? And if work-meaning cannot be independently obtained, how can we know it is a case where intentions are successfully realized and not a case where intentions failed? It follows that appeal to successful intention results in redundancy or indeterminacy.

The first horn of the dilemma assumes that work-meaning can be obtained independently of knowledge of successful intention, but this is false for moderate intentionalists, for they acknowledge that in many cases the work presents ambiguity that cannot be resolved solely in virtue of internal evidence. Moderate intentionalists reject the second horn by claiming that they do not determine the success of an intention by comparing independently obtained work-meaning with the artist’s intention (Stecker, 2010, pp. 154–5). As already discussed, moderate intentionalists propose different success conditions that do not appeal to the identity between the artist’s intention and work-meaning. Moderate intentionalists adopting the weak standard hold that success is defined by the degree of meshing; those who adopt the strong standard maintain that success is defined by the audience’s ability to grasp the intention. Neither requires the interpreter to identify a work’s meaning independently of the artist’s intention.

d. Objections to Actual Intentionalism

The most commonly raised objection is the epistemic worry, which asks: is intention knowable? It seems impossible for one to really know others’ mental states, and the epistemic gap in this respect is thus unbridgeable. Actual intentionalists tend to dismiss this worry as insignificant and maintain that in many contexts (daily conversation or historical investigations) we have no difficulty in discerning another person’s intention (Carroll, 2009, pp. 71–5). In that case, why would things suddenly stand differently when it comes to art interpretation? This is not to say that we succeed on every occasion of interpretation, but that we do so in an amazingly large number of cases. Accordingly, we should not reject the appeal to intention solely because of the occasional failure.

Another objection is the publicity paradox (Nathan, 2006). The main idea is this: when someone S conveys something p by a production of an object O for public consumption, there is a second-order intention that the audience need not go beyond O to reach p; that is, there is no need to consult S’s first-order intentions to understand O. Therefore, when an artist creates a work for public consumption, there is a second-order intention that her first-order intentions not be consulted, otherwise it would indicate the failure of the artist. Actual intentionalism hence leads to the paradoxical claim that we should and should not consult the artist’s intentions.

The actual intentionalist’s response (Stecker, 2010, pp. 153–4) is this: not all artists have the second-order intention in question. If this premise is false, then the publicity argument becomes unsound. Even if it were true, the argument would still be invalid, because it confuses the artist’s intention to create something that stands alone with the intention that her first-order intention need not be consulted. The paradox will not hold if this distinction is made.

Lastly, many criticisms are directed at a popular argument among actual intentionalists: the conversation argument (Carroll, 2001; Jannotta, 2014). An analogy between conversation and art interpretation is drawn, and actual intentionalists claim that if we accept that art interpretation is a form of conversation, we need to accept actual intentionalism as the right prescriptive account of interpretation, because the standard goal of an interlocutor in a conversation is to grasp what the speaker intends to say. (This is a premise even anti-intentionalists accept, but they apparently reject the further claim that art interpretation is conversational. See Beardsley, 1970, ch. 1.) This analogy has been severely criticized (Dickie, 2006; Nathan, 2006; Huddleston, 2012). The greatest disanalogy between conversation and art is that the latter is more like a monologue delivered by the artist than an interchange of ideas.

One way to meet the monologue objection is to specify more clearly the role of the conversational interest. In fact, the actual intentionalist claims that the conversational interest should constrain other interests such as the aesthetic interest. In other words, other interests can be reconciled or work with the conversational interest. Take the case of the hermeneutics of suspicion for example. Hermeneutics of suspicion is a skeptical attitude—often heavily politicized—adopted toward the explicit stance of a work. Interpretations based on the hermeneutics of suspicion have to be constrained by the artist’s non-ironic intention in order for them to count as legitimate interpretations. For instance, in attributing racist tendencies to Jules Verne’s Mysterious Island, in which the black slave Neb is portrayed as docile and superstitious, we need to suppose that the tendencies are not ironic; otherwise, the suspicious reading becomes inappropriate. In this example, the artistic conversation does not end up being a monologue, for the suspicious hermeneut listens and understands Verne before responding with the suspicious reading, which is constrained by the conversational interest. A conversational interchange is hence completed.

5. Hypothetical Intentionalism

a. Overview

A compromise between actual intentionalism and anti-intentionalism is hypothetical intentionalism, the core claim of which is that the correct meaning of a work is determined by the best hypothesis about the artist’s intention made by a selected audience. The aim of interpretation is then to hypothesize what the artist intended when creating the work from the perspective of the qualified audience (Tolhurst, 1979; Levinson, 1996).

Two points call for attention. First, it is hypothesis—not truth—that matters. This means that a hypothesis of the actual intention will never be trumped by knowledge of that very intention. Second, the membership of the audience is crucial because it determines the kind of evidence legitimate for the interpreter to use.

Tolhurst’s 1979 proposal suggests that the relevant audience be singled out by the artist’s intention, that is, the audience intended to be addressed by the artist. Work-meaning is thus determined by the intended audience’s best hypothesis about the artist’s intention. This means that the interpreter will need to equip herself with the relevant beliefs and background knowledge of the intended audience in order to make the best hypothesis. Put another way, hypothetical intentionalism focuses on the audience’s uptake of an utterance addressed to them. This being so, what the audience relies on in comprehending the utterance will be based on what it knows about the utterer on that particular occasion. Following this contextualist line of thinking, the meaning of Jonathan Swift’s A Modest Proposal will not be the suggestion that the poor in Ireland might ease their economic pressure by selling their children as food to the rich; rather, given the background knowledge of Swift’s intended audience, the best hypothesis about the author’s intention is that he intended the work to be a satire that criticizes the heartless attitude toward the poor and Irish policy in general.

However, there is a serious problem with the notion of an intended audience. If the intended audience is an extremely small group possessing esoteric knowledge of the artist, meaning becomes a private matter, for the work can only be properly understood in terms of private information shared between artist and audience, and this results in something close to Humpty-Dumptyism, which is characteristic of absolute intentionalism.

To cope with this problem, the hypothetical intentionalist replaces the concept of an intended audience with that of an ideal or appropriate audience. Such an audience is not necessarily targeted by the artist’s intention and is ideal in the sense that its members are familiar with the public facts about the artist and her work. In other words, the ideal audience seeks to anchor the work in its context of creation based on public evidence. This avoids the danger of interpreting the work on the basis of private evidence.

The hypothetical intentionalist is aware that in some cases there will be competing interpretations which are equally good. An aesthetic criterion is then introduced to adjudicate between these hypotheses. The aesthetic consideration comes as a tie breaker: when we reach two or more epistemically best hypotheses, the one that makes the work artistically better should win.

Another notable distinction introduced by hypothetical intentionalism is that between semantic and categorial intention (Levinson, 1996, pp. 188–9). The kind of intention we have been discussing is semantic: it is the intention by which an artist conveys her message in the work. By contrast, categorial intention is the artist’s intention to categorize her production, either as a work of art, a certain artform (such as Romantic literature), or a particular genre (such as lyric poetry). Categorial intention indirectly affects a work’s semantic content because it determines how the interpreter conceptualizes the work at the fundamental level. For instance, if a text is taken as a grocery list rather than an experimental story, we will interpret it as saying nothing beyond the named grocery items. For this reason, the artist’s categorial intention should be treated as among the contextual factors relevant to her work’s identity. This move is often adopted by theorists endorsing contextualism, such as maximizers or moderate intentionalists.

b. Notable Objections and Replies

Hypothetical intentionalism has received many criticisms and challenges that merit mention. A frequently expressed worry is that it seems odd to stick to a hypothesis when newly found evidence proves it to be false (Carroll, 2001, pp. 208–9). If an artist’s private diary is located and reveals that our best hypothesis about her intention regarding her work is false, why should we cling to that hypothesis if the newly revealed intention meshes well with the work? Hypothetical intentionalism implausibly implies that warranted assertibility constitutes truth.

The hypothetical intentionalist clarifies her position (Levinson, 2006, p. 308) by saying that warranted assertibility does not constitute the truth for the utterer’s meaning, but it does constitute the truth for utterance meaning. The ideal audience’s best hypothesis constitutes utterance meaning even if it is designed to infer the utterer’s meaning.

Another troublesome objection states that hypothetical intentionalism collapses into the value-maximizing theory, for, when making the best hypothesis of what the artist intended, the interpreter inevitably attributes to the artist the intention to produce a piece with the highest degree of aesthetic value that the work can sustain (Davies, 2007, pp. 183–84). That is, the epistemic criterion for determining the best hypothesis is inseparable from the aesthetic criterion.

In reply, it is claimed that this objection may stem from the impression that an artist normally aims for the best; however, this does not imply that she would anticipate and intend the artistically best reading of the work. It follows that it is not necessary that the best reading be what the artist most likely intended even if she could have intended it. The objector replies that, still, the situation in which we have two epistemically plausible readings while one is inferior cannot arise, because we would adopt the inferior reading only when the superior reading is falsified by evidence.

The third objection is that the distinction between public and private evidence is blurry (Carroll, 2001, p. 212). Is public evidence published evidence? Does published information from private sources count as public? The reply from the hypothetical intentionalist emphasizes that this is not a distinction between published and unpublished information (Levinson, 2006, p. 310). The relevant public context should be reconstrued as what the artist appears to have wanted the audience to know about the circumstances of the work’s creation. This means that if it appears that the artist did not want to make certain proclamations of intent known to the audience, then this evidence, even if published at a later point, does not constitute the public context to be considered for interpretation.

Finally, two notable counterexamples to hypothetical intentionalism have been proposed (Stecker, 2010, pp. 159–60). The first counterexample is that W means p but p is not intended by the artist and the audience is justified in believing that p is not intended. In this case hypothetical intentionalism falsely implies that W does not mean p. For example, it is famously known among readers of Sherlock Holmes adventures that Dr. Watson’s war wound appears in two different locations. On one occasion the wound is said to be on his arm, while on another it is on his thigh. In other words the Holmes story fictionally asserts impossibility regarding Watson’s wound. But given the realistic style of the Holmes adventures, the best hypothesis of authorial intent in this case would deny that the impossibility is part of the meaning of the story, which is apparently false.

However, the hypothetical intentionalist would not maintain that W means p, because p is not the best hypothesis. She would not claim that the Holmes story fictionally asserts impossibility regarding Watson’s wound, for the best hypothesis made by the ideal reader would be that Watson has the wound somewhere on his body—his arm or thigh, but exactly where we do not know. It is a mistake to presuppose that W means p without following the strictures imposed by hypothetical intentionalism to properly reach p.

The second counterexample to hypothetical intentionalism is the case where the audience is justified in believing that p is intended by the artist but in fact W means q; the audience would then falsely conclude that W means p. Again, what W means is determined by the ideal audience’s best hypothesis based on convention and context, not by what the work literally asserts. The meaning of the work is the product of a prudent assessment of the total evidence available.

6. Hypothetical Intentionalism and the Hypothetical Artist

a. Overview

There is a second variety of hypothetical intentionalism that is based on the concept of a hypothetical artist. Generally speaking, it maintains that interpretation is grounded on the intention suitably attributed by the interpreter to a hypothetical or imagined artist. This version of hypothetical intentionalism is sometimes called fictionalist intentionalism or postulated authorism. The theoretical apparatus of a hypothetical artist can be traced back to Wayne Booth’s account of the “implied author,” in which he suggests that the critic should focus on the author we can make out from the work instead of on the historical author, because there is often a gap between the two.

Though proponents of the present brand of intentionalism disagree on the number of acceptable interpretations and on what kind of evidence is legitimate, they agree that the interpreter ought to concentrate on the appearance of the work. If it appears, based on internal evidence (and perhaps contextual information if contextualism is endorsed), that the artist intends the work to mean p, then p is the right interpretation of the work. The artist in question is not the historical artist; rather, it is an artist postulated by the audience to be responsible for the intention made out from, or implied by, the work. For example, if there is an anti-war attitude detected in the work, the intention to castigate war should be attributed to the postulated artist, not to the historical artist. The motivation behind this move is to maintain work-centered interpretation but avoid the fallacious reasoning that whatever we find in the work is intended by the real artist.

Inheriting the spirit of hypothetical intentionalism, fictionalist intentionalism aims to make interpretation work-based but author-related at the same time. The biggest difference between the two stances is that, as said, fictionalist intentionalism does not appeal to the actual or real artist, thereby avoiding any criticisms arising from hypothesizing about the real artist such as that the best hypothesis about the real artist’s intention should be abandoned when compelling evidence against it is obtained.

b. Notable Objections and Replies

The first concern with fictionalist intentionalism is that constructing a historical variant of the actual artist sounds suspiciously like hypothesizing about her (Stecker, 1987). But there is still a difference. “Hypothesizing about the actual artist,” or more accurately, “hypothesizing the actual artist’s intention,” would be a characterization of hypothetical intentionalism rather than fictionalist intentionalism. The latter does not track the actual artist’s intention but constructs a virtual one. As shown, fictionalist intentionalism, unlike hypothetical intentionalism, is immune to any criticisms resulting from ignoring the actual artist’s proclamation of her intention.

A second objection criticizes fictionalist intentionalism for not being able to distinguish between different histories of creative processes for the same textual appearance (Livingston, 2005, pp. 165–69). For example, suppose a work that appears to be produced with a well-conceived scheme did result from that kind of scheme; suppose further that a second work that appears the same actually emerged from an uncontrolled process. Then, if we follow the strictures of fictionalist intentionalism, the interpretations we produce for these two works would turn out to be the same, for based on the same appearance the hypothetical artists we construct in both cases would be identical. But these two works have different creative histories and the difference in question seems too crucial to be ignored.

The objection here fails to consider the subtlety of reality-dependent appearances (Walton, 2008, ch. 12). For example, suppose the exhibit note beside a painting tells us it was created when the painter got heavily drunk. Any well-organized feature in the work that appears to result from careful manipulation by the painter might now either look disordered or structured in an eerie way depending on the feature’s actual presentation. Compare this scenario to another where a (almost) visually indistinguishable counterpart is exhibited in the museum with the exhibit note revealing that the painter spent a long period crafting the work. In this second case the audience’s perception of the work is not very likely to be the same as that in the first case. This shows how the apparent artist account can still discriminate between (appearances of) different creative histories of the same artistic presentation.

Finally, there is often the qualm that fictionalist intentionalism ends up postulating phantom entities (hypothetical creators) and phantom actions (their intendings). The fictionalist intentionalist can reply that she is giving descriptions only of appearances instead of quantifying over hypothetical artists or their actions.

7. Conclusion

From the above discussion we can notice two major trends in the debate. First, most late 20th century and 21st century participants are committed to the contextualist ontology of art. The relevance of art’s historical context, since its first philosophical appearance in Arthur Danto’s 1964 essay “The Artworld,” continues to influence analytic theories of art interpretation. There is no sign of this trend diminishing. In Noël Carroll’s 2016 survey article on interpretation, the contextualist basis is still assumed.

Second, actual intentionalism remains the most popular position of all. Many substantial monographs have been written in this century to defend the position (Stecker, 2003; Livingston, 2005; Carroll, 2009; Stock, 2017). This intentionalist prevalence probably results from the influence of H. P. Grice’s work on the philosophy of language. And again, this trend, like the contextualist vogue, is still ongoing. And if we see intentionalism as an umbrella term that encompasses not only actual intentionalism but also hypothetical intentionalism and probably fictionalist intentionalism, the influence of intentionalism and its related emphasis on the concept of an artist or author will be even stronger. This presents an interesting contrast with the trend in post-structuralism that tends to downplay authorial presence in theories of interpretation, as embodied in the author-is-dead thesis championed by Barthes and Foucault (Lamarque, 2009, pp. 104–15).

8. References and Further Reading

  • Beardsley, M. C. (1970). The possibility of criticism. Detroit, MI: Wayne State University Press.
  • Contains four philosophical essays on literary criticism. The first two are among Beardsley’s most important contributions to the philosophy of interpretation.

  • Beardsley, M. C. (1981a). Aesthetics: Problems in the philosophy of criticism (2nd ed.). Indianapolis, IN: Hackett.
  • A comprehensive volume on philosophical issues across the arts and also a powerful statement of anti-intentionalism.

  • Beardsley, M. C. (1981b). Fiction as representation. Synthese, 46, 291–313.
  • Presents the speech act theory of literature.

  • Beardsley, M. C. (1982). The aesthetic point of view: Selected essays. Ithaca, NY: Cornell University Press.
  • Contains the essay “Intentions and Interpretations: A Fallacy Revived,” in which Beardsley applies his speech act theory to the interpretation of fictional works.

  • Booth, W. C. (1983). The rhetoric of fiction (2nd ed.). Chicago, IL: University of Chicago Press.
  • Contains the original account of the implied author.

  • Carroll, N. (2001). Beyond aesthetics: Philosophical essays. New York, NY: Cambridge University Press.
  • Contains in particular Carroll’s conversation argument, discussion of the hermeneutics of suspicion, defense of moderate intentionalism, and criticism of hypothetical intentionalism.

  • Carroll, N. (2009). On criticism. New York, NY: Routledge.
  • An engaging book on artistic evaluation and interpretation.

  • Carroll, N., & Gibson, J. (Eds.). (2016). The Routledge companion to philosophy of literature. New York, NY: Routledge.
  • Anthologizes Carroll’s survey article on the intention debate.

  • Currie, G. (1990). The nature of fiction. Cambridge, England: Cambridge University Press.
  • Contains a defense of fictionalist intentionalism.

  • Currie, G. (1991). Work and text. Mind, 100, 325–40.
  • Presents how a commitment to contextualism leads to an important distinction between work and text in the case of literature.

  • Danto, A. C. (1964). The artworld. Journal of Philosophy, 61, 571–84.
  • First paper to draw attention to the relevance of a work’s context of production.

  • Davies, S. (2005). Beardsley and the autonomy of the work of art. Journal of Aesthetics and Art Criticism, 63, 179–83.
  • Argues that Beardsley is actually a contextualist.

  • Davies, S. (2007). Philosophical perspectives on art. Oxford, England: Oxford University Press.
  • Part II contains Davies’ defense of the maximizing position and criticisms of other positions.

  • Dickie, G. (2006). Intentions: Conversations and art. British Journal of Aesthetics, 46, 71–81.
  • Criticizes Carroll’s conversation argument and actual intentionalism.

  • Goldman, A. H. (2013). Philosophy and the novel. Oxford, England: Oxford University Press.
  • Contains a defense of the value-maximizing theory without a contextualist commitment.

  • Hirsch, E. D. (1967). Validity in interpretation. New Haven, CT: Yale University Press.
  • The most representative presentation of extreme intentionalism.

  • Hirsch, E. D. (1976). The aims of interpretation. Chicago, IL: University of Chicago Press.
  • Contains a collection of essays expanding Hirsch’s views on interpretation.

  • Huddleston, A. (2012). The conversation argument for actual intentionalism. British Journal of Aesthetics, 52, 241–56.
  • A brilliant criticism of Carroll’s conversation argument.

  • Iseminger, G. (Ed.). (1992). Intention & interpretation. Philadelphia, PA: Temple University Press.
  • A valuable collection of essays featuring Beardsley’s account of the work’s autonomy, Knapp and Michaels’ absolute intentionalism, Iseminger’s extreme intentionalism, Nathan’s account of the postulated artist, Levinson’s hypothetical intentionalism, and eight other contributions.

  • Jannotta, A. (2014). Interpretation and conversation: A response to Huddleston. British Journal of Aesthetics, 54, 371–80.
  • A defense of the conversation argument.

  • Krausz, M. (Ed.). (2002). Is there a single right interpretation? University Park: Pennsylvania State University Press.
  • Another valuable anthology on the intention debate, containing in particular Carroll’s defense of moderate intentionalism and Lamarque’s criticism of viewing work-meaning as utterance meaning.

  • Lamarque, P. (2009). The philosophy of literature. Malden, MA: Blackwell.
  • The third and the fourth chapters discuss analytic theories of interpretation along with a critical assessment of the author-is-dead claim.

  • Levinson, J. (1996). The pleasure of aesthetics: Philosophical essays. Ithaca, NY: Cornell University Press.
  • The tenth chapter is Levinson’s revised presentation of hypothetical intentionalism and the distinction between semantic and categorial intention.

  • Levinson, J. (2006). Contemplating art: Essays in aesthetics. Oxford, England: Oxford University Press.
  • Contains Levinson’s replies to major objections to hypothetical intentionalism.

  • Levinson, J. (2016). Aesthetic pursuits: Essays in philosophy of art. Oxford, England: Oxford University Press.
  • Contains Levinson’s updated defense of hypothetical intentionalism and criticism of Livingston’s moderate intentionalism.

  • Livingston, P. (2005). Art and intention: A philosophical study. Oxford, England: Oxford University Press.
  • A thorough discussion on intention, literary ontology, and the problem of interpretation, with emphases on defending the meshing condition and on the criticisms of the two versions of hypothetical intentionalism.

  • Nathan, D. O. (1982). Irony and the artist’s intentions. Journal of Aesthetics and Art Criticism, 22, 245–56.
  • Criticizes the notion of an intended audience.

  • Nathan, D. O. (2006). Art, meaning, and artist’s meaning. In M. Kieran (Ed.), Contemporary debates in aesthetics and the philosophy of art (pp. 282–93). Oxford, England: Blackwell.
  • Presents an account of fictionalist intentionalism, a critique of the conversation argument, and a brief recapitulation of the publicity paradox.

  • Nehamas, A. (1981). The postulated author: Critical monism as a regulative ideal. Critical Inquiry, 8, 133–49.
  • Presents another version of fictionalist intentionalism.

  • Stecker, R. (1987). Apparent, implied, and postulated authors. Philosophy and Literature, 11, 258–71.
  • Criticizes different versions of fictionalist intentionalism.

  • Stecker, R. (2003). Interpretation and construction: Art, speech, and the law. Oxford, England: Blackwell.
  • A valuable monograph devoted to the intention debate and its related problems such as the ontology of art, incompatible interpretations and the application of theories of art interpretation to law. The book defends moderate intentionalism in particular.

  • Stecker, R. (2010). Aesthetics and the philosophy of art: An introduction. Lanham, MD: Rowman & Littlefield.
  • Contains a chapter that presents the disjunctive formulation of moderate intentionalism and the two counterexamples to hypothetical intentionalism.

  • Stecker, R., & Davies, S. (2010). The hypothetical intentionalist’s dilemma: A reply to Levinson. British Journal of Aesthetics, 50, 307–12.
  • Counterreplies to Levinson’s replies to criticisms of hypothetical intentionalism.

  • Stock, K. (2017). Only imagine: Fiction, interpretation, and imagination. Oxford, England: Oxford University Press.
  • Contains a defense of absolute (the author uses the term “extreme”) intentionalism.

  • Tolhurst, W. E. (1979). On what a text is and how it means. British Journal of Aesthetics, 19, 3–14.
  • The founding document of hypothetical intentionalism.

  • Trivedi, S. (2001). An epistemic dilemma for actual intentionalism. British Journal of Aesthetics, 41, 192–206.
  • Presents an epistemic dilemma for actual intentionalism and defense of hypothetical intentionalism.

  • Walton, K. L. (2008). Marvelous images: On values and the arts. Oxford, England: Oxford University Press.
  • A collection of essays, including “Categories of Art,” which might have inspired Levinson’s conception of categorial intention; and “Style and the Products and Processes of Art,” which is a defense of fictionalist intentionalism in terms of the notion “apparent artist.”

  • Wimsatt, W. K., & Beardsley, M. C. (1946). The intentional fallacy. The Sewanee Review, 54, 468–88.
  • The first thorough presentation of anti-intentionalism, commonly regarded as the starting point of the intention debate.

 

Author Information

Szu-Yen Lin
Email: lsy17@ulive.pccu.edu.tw
Chinese Culture University
Taiwan

Plotinus: Virtue Ethics

This article focuses on the virtue ethics of Plotinus (204–270 C.E.) and its implications for later accounts of virtue ethics, particularly in Porphyry and Iamblichus. Plotinus’ ethical theory is discussed in relation to the aim of the virtuous person to become godlike, the role of disposition in the soul’s intellectualization, the four cardinal virtues, well-being, human freedom, and self-determination. Plotinus’ virtue ethics is also presented with regard to his theory of transmigration and his criticism of the Gnostics.

Plotinus was a neo-Platonist, and Plato’s ethical teaching underlies Plotinus’ conception of virtue as an intrinsic quality of human character as well as his conception of excellence that derives from the soul’s purity in the contemplation of the Forms. Aristotle’s ethical theory influences Plotinus, particularly Aristotle’s recognition of the gods as purely intelligible beings that do not possess virtues. Even more importantly, Aristotle’s distinction between intellectual and ethical virtues was a great influence upon Plotinus.

Plotinus’ virtue ethics has been used by later Neoplatonists such as Porphyry, Iamblichus, Macrobius, and Olympiodorus. Plotinus’ treatment of virtues is also found in the ethical theories of Arabic Neoplatonists and in Neoplatonic commentaries on the Aristotelian ethics. Plotinus’ analysis of the four Platonic cardinal virtues has been systematically treated by Porphyry.

Table of Contents

  1. Philosophical Background and Reception
  2. Becoming like God
  3. Disposition and Intellectual Qualities
  4. The Cardinal Virtues
    1. Courage
    2. Self-control
    3. Justice
    4. Wisdom
  5. Well-being
  6. Human Freedom and Self-Determination
  7. Soul and Transmigration
  8. Criticism of the Gnostics
  9. Virtue Ethics after Plotinus
    1. Porphyry
    2. Iamblichus
  10. References and Further Reading
    1. Texts and Translations
    2. Commentaries
    3. Introductory Sources
    4. Ethical Theory
    5. Human Freedom and Selfhood
    6. Post-Plotinian Virtue Ethics

1. Philosophical Background and Reception

Plato’s and Aristotle’s virtue ethics are found in the background of Plotinus’ ethical theory. Plato’s ethical teaching (particularly in the Symposium, the Phaedo, the Phaedrus, and the Republic) underlies Plotinus’ conception of virtue as an intrinsic quality of human character and his conception of excellence that derives from the soul’s purity in the contemplation of the Forms. Aristotle’s ethical theory (Nicomachean Ethics) influences Plotinus, particularly Aristotle’s recognition of the gods as purely intelligible beings that do not possess virtues (NE 1178), but even more importantly Aristotle’s distinction between intellectual and ethical virtues (NE 1139).

Plotinus’ virtue ethics, mainly expounded in Ennead I 2, has been used by later Neoplatonists such as Porphyry, Iamblichus, Macrobius, and Olympiodorus (O’Meara 2003). Plotinus’ treatment of virtues is also found in the ethical theories of Arabic Neoplatonists and in Neoplatonic commentaries on the Aristotelian ethics (Smith 2004). Plotinus’ analysis of the four Platonic cardinal virtues in Ennead I 2 has been systematically treated by Porphyry in his Sententiae ad intelligibilia ducentes (section 32).

Porphyry discussed a fourfold scale of virtues in correspondence to the area where the virtues apply: (1) political virtues correspond to the practical and civic sphere, (2) purificatory virtues correspond to soul’s initial purification and ascent from the body, (3) theoretical virtues correspond to soul’s contemplation of the Forms, and (4) the paradigmatic, or exemplary virtues, correspond directly to the Forms and the divine Nous.

Porphyry, in his biographical work on Plotinus, On the Life of Plotinus and the Order of his Books, classifies the nine treatises of the first Ennead as his master’s work that is “mainly concerned with morals” (Life 24.17-18). Plotinus’ ethical theory is mainly discussed in Enneads I 2 [19] On Virtues, I 4 [46] On Well-Being, and I 3 [20] On Dialectic. Whereas Ennead I 2 offers an analysis of Plato’s four cardinal virtues: (1) courage (andreia), (2) self-control (sophrosyne), (3) justice (dikaiosyne) and (4) wisdom (phronesis / sophia), Ennead I 4 focuses on the excellence of the wise man (spoudaios) and the nature of well-being (eudaimonia; see also Ennead I 5 [36] On Whether Well-Being Increases with Time). Furthermore, Ennead I 3 follows Ennead I 2 chronologically and supplements Plotinus’ ethical analysis of the virtues with special reference to the advantages of the Platonic dialectic in contrast to the Stoic and the Aristotelian logic. Plotinus highlights the significance of Plato’s dialectic with respect to the soul’s intellectual purification and its aim of noetic ascent.

In addition, Plotinus’ discussions on the nature of evil in Ennead I 8; on the metaphysics of beauty in Ennead I 6; on the philosophical comparison between the nature of human being and that of other living beings in Ennead I 1; and the very short treatise Ennead I 7 on the Platonic Good qua the primal good of all aspiration in life, include significant elements of his ethical theory. Finally, important implications of Plotinus’ virtue ethics are highlighted in his theory of transmigration (see particularly Enneads I 1.11; III 2.15), his criticism of Gnosticism in Ennead II 9 [33] Against the Gnostics, as well as his conception of human freedom and self-determination, particularly maintained in Ennead VI 8 [39] On Free Will and the Will of the One. Plotinus’ virtue ethics is further developed and systematized by later Neoplatonists such as Porphyry (Sententiae ad intelligibilia ducentes 32) and Iamblichus (On Virtues).

Porphyry’s pupil Iamblichus, in his work On Virtues (the work is not preserved today), developed further the scale of virtues of Sententiae 32 (Finamore 2012; O’Meara 2003). Iamblichus maintained a sevenfold scale of virtues. By tracing back to the Middle Platonists (Baltzly 2004), he added two more groups of virtues below the political and one group of virtues at the highest level above the paradigmatic virtues. Iamblichus’ classification developed from low to high with the following virtues: (1) natural, (2) ethical, (3) civic, (4) purifying, (5) contemplative, (6) paradigmatic, and (7) hieratic. Iamblichus’ scale of virtues testifies to the importance of theurgy for later Neoplatonists and influences St. Augustine’s early thought (Kalligas 2014).

2. Becoming like God

Plotinus’ treatise Ennead I 2 On Virtues opens with a question on the soul’s escape from evils in the earthly world: “Since it is here that evils are, and they must necessarily haunt this region, and the soul wants to escape from evils, we must escape from here. What, then, is this escape?” (I 2.1.1-3) For Plotinus, the answer should be found in Plato’s Theaetetus 176a-b, “to become as like God as possible”, and soul’s likeness to God should be related to the virtue of wisdom qua the highest ruling principle of the universe and the world soul (I 2.1.3-10). The passage from Plato’s Theaetetus marks Plotinus’ exposition of virtues in Ennead I 2 (2.1-10; 3.1 ff.; 5.1-2; 6.1-11; 7.27-30; see also Ennead I 8.6-7 on the necessity of evils) and Armstrong, in his introductory note on the treatise, emphatically regards Ennead I 2 as a commentary on the passage from the Theaetetus. In addition to Plato’s reference, Aristotelian and Stoic elements have been identified in Plotinus’ theory of virtue as well as some Neo-Pythagorean influences. Plotinus’ approach to “becoming like god” is discussed by later Neoplatonists such as Porphyry, Iamblichus, and Proclus (Baltzly 2004).

A careful reading of the first lines of Ennead I 2 shows a divergence from Plato’s assertion in the Theaetetus 176a-b that likeness to god is achievable up to a certain point. This difference seems to be not without purpose for the Neoplatonist and explains Plotinus’ interpretation of likeness to god in the same way as the Middle-Platonist Eudorus of Alexandria interpreted godlikeness “in virtue of that element in us which is capable of this”, and signifies the purpose of human life in Pythagoras, Socrates, and Plato (Dillon 1983). However, despite the omission of Plato’s qualification, Plotinus appropriately conceives the meaning of the Theaetetus passage as it is related to soul’s purification and the divine excellence of the virtuous life (Kalligas 1984; see particularly I 2.6.9-10, 7.24 and II 9.9.50-1). Plotinus’ metaphysics of power justifies the possibility of the virtuous soul to ascend to the higher intelligible realm without inherent limitations or qualifications. Plotinus actually diminishes Plato’s qualification of the Theaetetus since the soul’s noetic and complete likeness to the god is possible. Plotinus puts an emphasis on the intelligible purity of the soul and the power of virtue to lead the human mind to noetic ascent and the higher intelligible principles; our virtues are intelligible powers in the soul and derive from the divine Intellect, so the soul is able to return to the intelligible realm of the Forms and become like the divine Nous. The goal of the virtuous and wise person is to become godlike (II 9.15.40). The wise person is likened through virtue to the self-sufficient, perfect, and pure life of the intelligible world.

3. Disposition and Intellectual Qualities

Aristotle in his Nicomachean Ethics (1106b36-1107a1) defines virtue as a “disposition” (hexis) of the soul that is concerned with deliberate choice. The disposition of the soul underlies moral action in terms of moderation (mesotes), that is, the appropriate mean between the two extremes of deficiency and excess (1107a2-6). Aristotle emphasized the habitual aspect of disposition both in terms of ethical exercise (praxis) and the desired excellence of the moral agent.

Plotinus, in the fifth chapter of Ennead VI 8 On Free Will and the Will of the One, defines virtue as a hexis of the soul, but not in habitual terms. The Neoplatonist stresses the intellectual qualities of virtue not in terms of ethical practice but mainly in terms of contemplation. Virtue is a hexis not in the dispositional sense of ethical praxis but as an active state of the soul, a contemplative disposition that “intellectualises the soul” beyond ethical practice: “being in our power does not belong to the realm of action but in intellect at rest from activity” (VI 8.5.35-36). Plotinus underlines a self-directed perspective of the moral soul’s power of virtue. Virtue intellectualizes the soul in its internal contemplation of Nous and not in external considerations. The Plotinian hexis is not found in the moderation of praxis but in the soul’s conscious apprehension of being, and particularly in the middle region of the soul, in between the higher intelligible and the lower perceptible regions of the psyche. The Plotinian virtue is an active hexis that consciously directs the soul in the contemplation of the intelligible world of the Forms.

4. The Cardinal Virtues

In Ennead II 9, Plotinus acknowledges the inherent value of virtue: “if we talk about God without true virtue, God is only a name” (15.40). For Plotinus, every virtue is purification, and the purified soul becomes both form and forming principle. The virtuous soul noetically ascends without body to the divine realm of Nous, the world of true goodness, intelligence, and beauty (I 6.6). In Ennead I 6 On Beauty, Plotinus particularly refers to the four cardinal virtues found in Plato’s teaching, namely wisdom, justice, self-control, and courage: “For, as was said in old times, self-control, and courage and every virtue, is a purification, and so is even wisdom itself.… For what can true self-control be except not keeping company with bodily pleasures, but avoiding them as impure and belonging to something impure? Courage, too, is not being afraid of death. And death is the separation of body and soul; and a man does not fear this if he welcomes the prospect of being alone. Again, greatness of soul is despising the things here: and wisdom is an intellectual activity which turns away from the things below and leads the soul to those above” (I 6.6.1-13).

In Ennead I 2, Plotinus focuses on the four cardinal virtues, emphasizing their intellectual and contemplative nature. However, as Smith claims, Plotinus’ aim is not to suggest a fixed scale of virtues but an ascending schema of levels of the cardinal virtues in relation to the different levels and aspects of humanity’s ethical and intellectual life (Smith 2004). Plotinus approaches the cardinal virtues from the following aspects: (P1) civic life (I 2.1.17-21), (P2) the purification of the soul in relation to the body (3.15-9), (P3) soul’s contemplation of the higher intelligible world (6.12-27), and (P4) the intelligible purity and goodness of the Forms (7.3-6). Despite the fact that Plotinus’ treatment of the four cardinal virtues in Ennead I 2 is not necessarily systematic, the following accounts are identified in relation to the four levels above.

a. Courage

In civic life, courage deals with the soul’s emotions (P1); in the process of purification, courage is characterized by the soul’s fearlessness in departing from the body (P2); in a contemplative state, courage is the virtue that frees the soul from lower affections in likeness to the soul’s higher intelligible part (P3); and at the highest intelligible level of Nous, courage is identified with “immateriality and abiding pure by itself” (P4).

b. Self-control

The agreement and harmony of the soul’s passion and reason underlies self-control as civic virtue (P1). Whereas at a purificatory level self-control means that the soul is not sharing bodily experience (P2), at the level of contemplation it means the soul’s inward turning to Intellect (P3), and at the highest intelligible level of the Forms, self-control is identified with self-concentration (P4).

c. Justice

In civic life, justice is defined in Plato’s terms as the virtue that facilitates the agreement between the different parts of the soul in “minding their own business where ruling and being ruled are concerned” (P1), while in purificatory terms, justice is purely ruled by reason and intellect “without opposition” (P2). However, at a contemplative level, justice is not found in the plurality of the soul’s parts but in the disposition of the unity to itself, and so “the higher justice in the soul is its activity towards Intellect” (P3). At this pure intelligible level, justice entails the soul’s proper and paradigmatic activity in minding its own business beyond any plurality (P4).

d. Wisdom

As a civic virtue, phronesis, practical wisdom, is related to the discursive reason of the soul (P1), while as a purificatory virtue, it refers to the soul “acting alone” outside the experience of the body and mere opinion (P2). In a contemplative person, practical wisdom and theoretical wisdom (sophia) involve the contemplation of the intelligibles, that is, what the divine Intellect contains and possesses in immediate contact. Plotinus discriminates between the wisdom of Intellect and that of the soul; wisdom, as with all virtues, is not a virtue in Nous but manifests only in the soul. Wisdom in Intellect is its pure actuality (=intelligence) and what it really is (=being); in the soul, wisdom derives from Nous but is directed to other things (P3), and so the paradigm of wisdom is related to pure intelligence and knowledge manifested in the soul’s direct sight towards the hypostasis of Intellect (P4).

In Ennead I 3, Plotinus further distinguishes between higher virtues and lower virtues. Plotinus maintains that the higher virtues are interrelated and correspond to intelligible Forms, which are not virtues themselves, but contribute to the noetic ascent as well as the practical and theoretical excellence of the soul. Moral philosophy is not only about intellectual virtues but also deals with the production of the appropriate dispositions and exercises (I 3.6). However, the higher virtues contribute to the purification of the soul and “moral philosophy derives from dialectic in its contemplative side.” Dialectic is the purest part of intelligence and wisdom that guides the soul to knowledge and apprehension in correct order and reason. As Plotinus maintains in Ennead I 2, if all virtues are purifications, the process of purification produces and perfects all virtues, and so the one who possesses the greater intellectual virtues must necessarily have the lesser civic virtues. However, this is not admitted to the one who possesses the lesser virtues. The intellectual virtues complete the lower virtues and not vice versa (I 3.6.5-7). For Plotinus, well-being (eudaimonia) is achieved only with the excellence of the higher virtues that lead to the intelligible world.

5. Well-being

In Ennead I 4 On Well-Being, Plotinus criticizes Aristotle’s primal relation of well-being (eudaimonia) with practical accomplishment, proper functions, and the achievement of natural ends (chapter 1). Plotinus is also skeptical of the Stoics and the Epicureans (chapter 2); eudaimonia should be related neither to the Stoic ‘extirpation of passion’ (apatheia) and the “study of primary natural needs which perfects reason”, nor to the pleasure of an unworried state of mind (ataraxia) supported by the Epicureans. As a devoted Platonist, Plotinus returns to the original teaching of Plato for an answer to the question of eudaimonia. Plotinus follows Plato’s metaphysical perspective on eudaimonia in relation to the contemplation of the Forms. For Plotinus, well-being (eudaimonia) is not achieved primarily in ethical practice (praxis), as Aristotle suggested, but mainly through the noetic ascent of the soul and in contemplation (theoria) of true being in the intelligible realm of Nous. The wise person (spoudaios) has to become godlike (see Ennead I 2.1) to be eudaimon, that is, to live the perfect life of Intellect, the life of the higher soul purely contemplating the eternal reality of Nous. The real virtue of the wise is to be aware of the perfection, self-sufficiency, and completeness of Intellect, the intelligible reality where the soul is truly purified beyond discursive reason and consciousness (I 4.3.34-41).

The excellence of virtue is achieved not by having intellect but by being intellect; the perfectly virtuous soul of the wise is self-sufficient, ascends purified to the intelligible world and so likens itself to Intellect’s divine and eternal eudaimonia (I 4.4). However, Plotinus clarifies that the meaning of likeness (homoiosis) in the wise and good person is not the likeness of two pictures in perceptible terms but the intelligible likeness of the soul to the divine model of Nous different from our perceptible self (I 2.7.28-31). Hence, the soul of the wise man, purely concentrated on the divine realm, is not affected either by the sufferings or the misfortunes of the animated body (I 4.5-8), nor in any way influenced by the lower life of the material world (I 4.9). The spoudaios experiences a life in noetic purity guided only by the higher intelligible part, and any disturbances from the lower perceptible part hardly trouble the wise person (I 2.5.22 ff.). Any affections from the perceptible part of the soul are dim echoes for the mind of the wise man just because of the affinity between the two parts within the soul. The lower part is always benefited within the soul of the spoudaios “just as a man living next door to a sage would profit by the wise man’s neighborhood, either by becoming like him, or by regarding him with such respect as not to dare to do anything of which the good man would not approve” (I 2.5.25-28). However, the wise person is not careless about the perceptible body despite the fact that bodily goods will not contribute to eudaimonia; the wise person has to give to the body what the body really needs (I 4.11-16).

The concern of the wise person is “not to be out of sin, but to be God” (I 2.6.2-3), and so the virtue of the wise person that leads to true well-being is to exercise the higher activity of the soul’s intelligible self; true arete frees the spoudaios and leads the soul to the ultimate goal to become like the higher intelligible and eternal life of the divine Nous (I 2.6-7).

The wise man is also not inconsiderate to others (I 4.15.21-25) but does not belong to the mass of people (II 9.9.6-11). He chooses to be acquainted with virtuous friends and he is the paradigm of excellence and contemplative life. As Plotinus notes, the spoudaios is not “unfriendly” (aphilos) or careless about others, but he cares about his own soul as he cares about his own affairs and the excellence of his companions. The wise man manifests intelligible unity and purity by being an earthly paradigm of the divine Nous, and so “renders to his friends all that he renders to himself, and so will be the best of friends due to his union with Nous” (I 4.15.21-25). The wise man shares his eudaimonia by being present at the same time to his own self and the others (See Porphyry Life 8.19-23), and lives a friendship (philia) in the sensible world that imitates the friendship of the universal order and the higher divine realm of Nous and its unity with its Forms (VI 7.14). The power of philia traverses all the hypostases of being as it is identified with and derives from the supreme unity of the One (V 1.9.1-5).

6. Human Freedom and Self-Determination

Plotinus’ theory of virtue ethics is closely related to human freedom and self-determination. In the beginning of Ennead VI 8 On Free Will and the Will of the One Plotinus wonders, “Is there anything in our power?” In his analysis a distinction is offered between internal determinations (that is, what depends on us) and external determinations (that is, what is not dependent on us) (Eliasson 2008; Remes 2006). An action is voluntary and depends on us not only if we are free and we are not obliged to act, but also if we are not following the path of reason without critical evaluation. For Plotinus, an action depends purely on us only if the soul defines its own self as a self-determined principle (VI 8.3.20-26).

Plotinus’ notion of self-determination is related to the concept of “what depends on us” (eph’ hemin) as having the connotation of a faculty describing either the quality of action or the agent himself (Leroux 1990). Furthermore, a distinction has been suggested between an inclusive notion of “what depends on us” (that is, the moral action has its origins in the agent) and an exclusive notion (that is, the moral action has its origins in rational decisions and judgments not necessarily determined by the agent) (Eliasson 2008). For Plotinus, voluntariness and awareness of an action are not sufficient for an action to depend on us; the action must also proceed from our wish as it arises through the contemplation of virtue.

Furthermore, for Plotinus, moral actions that are determined by external factors are related to passive dispositions, but true virtue should be based on the internal state of the soul in relation to intellect (II 5.2.34-35). Moral agency reveals itself not primarily in ethical practice but in the excellence of the inner self in active contemplation of the Forms (II 3.9-10). The virtuous soul is purely dependent on its own self without considering external conditions or determinations; the free soul is self-determined only by internal conditions (III 1; see also VI 8.3.20-26) and acts autonomously in self-determination without regard to external parameters or situational conditions (VI 8.6.19-23). The virtuous action is underlined by three conditions: firstly, an action is voluntary (that is, we should not be forced to act); secondly, an action must be conscious (that is, we should have knowledge of what we are doing); and, thirdly, it must be self-determined (that is, we should be masters of ourselves) (Eliasson 2008). Considering a self-directed aspect of moral agency, Plotinus moves the emphasis from the outward activity of ethical practice (that is, Aristotle’s primary concern in relation to virtue ethics) to the inner activity of the contemplating soul. A free and noble action is not justified or based mainly on practice (praxis), but on the intellectual virtues of the soul as qualities of its intelligible self prior to moral action that is found in the perceptible realm (VI 8.6.20-22). Virtue is an active disposition of the soul in terms of contemplation (theoria) that ends in an established state of mind internally tuned and moderated in accordance with the perfection of the intelligible world. In light of this approach, well-being is not found in actions but in the inner contemplation of the soul.

As Plotinus puts it, “To place eudaimonia in actions is to locate it in something outside virtue and the soul; the activity of the soul lies in thought, and action of this kind within itself; and this is the state of eudaimonia” (I 5.10.20-23). True happiness of a free and moral soul is not established in external situations and activities but in internal determinations and intellectual virtues (I 5.1).

Moreover, whereas Aristotle conceives of human freedom as related to the problem of choice and contingency, Plotinus conceives of human freedom in relation to the freedom of the self (Leroux 1996) and the virtuous life of the wise person, without necessarily being defined by or dependent on voluntary choice (Ennead VI 8.1-7). Plotinus emphatically argues that no outward actions are purely dependent on us: “in practical actions self-determination and being in our power is not referred to practice and outward activity but to the inner activity of virtue itself, that is, its thought and contemplation” (VI 8.6.20-22). What depends on us can be found in the realm of intellect “at rest from actions” (VI 8.5.35-37). Only virtue as an intellectual quality purifies and frees the soul, and as Plotinus states by following Plato’s expression in the Myth of Er in the Republic (617e3), virtue has “no master” as far as it intellectualizes the soul beyond any external determination (VI 8.5.30-37).

7. Soul and Transmigration

Plotinus’ virtue ethics is a self-directed ethical theory that is related to his psychology and metaphysics. His ethical theory follows his theory of the psyche and its dual-aspect nature. The higher and lower virtues correspond to the higher intelligible and sense-perceptive parts of the soul (Ennead I 3). Whereas the lower virtues are related to passions and the lower sense-perceptive part of the soul, the higher virtues are related to wisdom and dialectic and refer to the higher intelligible part of the soul (I 3.6). Plotinus aims to stress the superiority of the soul’s higher intelligible part, which is its inner self and is contrasted with the soul’s sufferings and passions of the lower sense-perceptive part, which is related to the outer self. He maintains that tragic and cruel moments in life should not be taken seriously but should be regarded as incidents in the plot of a play: “we should be spectators of murders, and all deaths, and takings and sacking of cities, as if they were on the stages of theaters, all changes of scenery and costume and acted wailings and weepings” (III 2.15.43-47). It is not the soul’s inner self that participates in the “game of life” but “the outside shadow of man” (47-50). The higher soul remains unaffected by bodily conditions and so “the outer man has to take off the play-costume in which he is dressed” (55-57). The inner man is clear from affections, and this is our true self that possesses the virtues that belong to the realm of intellect and “have their seat actually in the separate soul, separate and separable even while it is still here below” (I 1.10.7-10).

Plotinus’ dual-aspect theory of the soul is related to his account of transmigration and its ethical implications. It is noteworthy that Plotinus never uses the term metempsychosis (reincarnation) but only metensomatosis (transmigration). Plotinus adopts a monistic view of transmigration. A monistic approach to transmigration agrees with the ontological unity and homogeneity of the soul and the non-eschatological aspect of human destiny. The transmigration of the soul should be conceived of as illumination of the living bodies. The soul is not literally transmigrated, since the bodies are just shadows and images of the higher soul. The bodies are projections of the soul and so transmigration is the illumination of the light of the soul transmitted into different bodily forms and without affecting the unity of the soul.

Plotinus stresses the ethical implications of transmigration originally found in the Platonic dialogues (Phaedo 81-82; Republic X. 620; Timaeus 91-92). However, in light of the soul’s ontological unity, homogeneity, and monism, Plotinus aims to reconcile some dualistic accounts of transmigration found in Plato, the early Pythagoreans, and some Presocratics such as Heraclitus and Empedocles (VI 4.16.4-7). His intention is to abolish the barriers between different psychic classes and hierarchies. Since the soul is one, homogenous, intelligible substance of life, all transmigrations into various life forms are possible (humans, animals, plants) and by extension, all animated bodies are rational and immortal (IV 7.14.1-8; see also VI 4.16; IV 8.1; III 4.2.16-30).

Plotinus’ reconsideration of Plato’s accounts of transmigration also has an ethical side. The logos of the soul manifests at different facets of life and being: the man who exercised political virtue becomes a man again, while the one who is not active in community becomes a bee; the man who loved music a song-bird; kings who ruled stupidly into eagles; those who lived with the senses animals; even plants for those who lived with the desire of flesh coupled with dullness of perception (III 4.2). Nevertheless, for Plotinus, whereas the transmigration of human souls into animal bodies is possible, the soul’s destiny has nothing to do with transmigration (I 1.11.8-15). It is not physical condition that affects the soul but the moral quality of the soul that affects the physical order, both of individual bodies and the cosmos. Plotinus denies an eschatological approach to transmigration for the soul’s higher intelligible part. As an intelligible entity, the soul is pure and immortal logos and thus sinless in its very nature (I 1.12.1-4). Since the soul is sinless it cannot be judged or punished in after-life nor transmigrated by passing from body to body. The higher part of the soul never descends completely to the lower realm of the sensible world (IV 8.8), while the lower part is a shadow of the higher part, and the descent of the soul is an inclination of the intelligible part in the realm of becoming (I 1.12). The dual-aspect nature of the soul is vividly described in Ennead I 1, where Plotinus uses the dual image of the noble and virtuous hero Heracles who “had this active virtue and in view of his noble character was deemed worthy to be called a god—because he was an active and not a contemplative person (in which case he would be altogether in that intelligible world), he is above, but there is also still a part of him below” (35-39).

Whereas Plotinus accepted transmigration of the soul in different forms and in terms of the soul’s purity and immortality while denying the soul’s bodily affection and sin, later Neoplatonists interpreted transmigration in different ethical terms: the evil man becomes a beast-like character and the sinful soul is temporarily associated with an animal body or form. This is actually the central point of controversy between Plotinian and post-Plotinian accounts of transmigration. Whereas, for Plotinus, the ethics of transmigration is based on the non-hierarchical monism and homogenous, intelligible nature of the soul, for later Neoplatonists, transmigration is denied in terms of a hierarchical ontology in which the human soul possesses a higher ranking of existence in comparison to the other animals. On the one hand, Porphyry seems to follow Plotinus’ transmigration of human soul into animal bodies as far as both human soul and animal souls are rational, deriving from the same intelligible source of the soul as second hypostasis of being (Smith, 1987; Wallis, 1995). On the other hand, Iamblichus and Proclus rejected human transmigration to animals as far as human and animal souls are essentially different and even denied that animals have souls at all in the strict sense of the term (Wallis, 1995).

8. Criticism of the Gnostics

In Ennead II 9 Against the Gnostics, Plotinus aims to defend Platonism against the immoral, pessimistic, and irrational doctrines of those who misinterpret Plato’s teaching and attribute evilness and darkness to the material universe (Puech 1960). Plotinus’ criticism is directed to a group of Gnostics who argued that knowledge should not be considered as a product of philosophical reasoning but of divine revelation (Wallis 1995). The Gnostics generally maintained that salvation is possible only through ‘knowledge’; gnosis is the only presupposition for the soul to find the pleroma, the spirit of the supreme God beyond this lesser and evil material universe. In order to emphasize the Gnostics’ irrational doctrines and perhaps their hypocritical and hyperbolic attitude, Plotinus describes them as speaking about Plato’s theories with “raving words” (II 9.18.20), like Sibyl’s delirious speech, as Heraclitus vividly expresses in fr. 92. In contrast to the Gnostics and other misinterpretations of Plato, Plotinus maintains that the material universe is the most perfect possible image of the intelligible world; the material world reflects in the best possible way the beauty and goodness of the divine realm.

Plotinus evaluates the Gnostic conceptions of the world, history, and ethics in three corresponding forms of alienation: firstly, alienation from the world, secondly, alienation from history, and thirdly, alienation from society (Kalligas 1997). Moreover, Plotinus’ objections are directed to the Gnostic doctrines of the denial of the divinity of the World-Soul and the heavenly bodies, the rejection of salvation through true virtue and wisdom, the non-philosophical and irrational support of their arguments, and the arrogant view of themselves as saved by nature, that is, as privileged beings in whom alone God is interested (see Armstrong’s introductory note on Ennead II 9; cf. also Wallis 1995). For Plotinus, the Gnostics are deceived when they believe that the universe is created by a fallen soul (II 9.4-5) and when they speak of the divine creator as an ignorant or evil Demiurge who produced an imperfect material world (II 9.6). They are mistaken when they regard the creative activities of the Demiurge as the result of a spiritual fall within the intelligible hierarchy (II 9.10-12); they are melodramatic when they speak about the influence of the cosmic spheres (II 9.13); they are in the wrong direction when they lay claim to the higher powers of magic (II 9.14); and they are completely misled when they believe that immortality is achievable through the complete rejection of and abstention from the material world.

Ennead II 9.15-18 includes an important account of Plotinus’ ethical criticism of the Gnostic movement. However, it is not only concerned with a polemic against Gnosticism but also with a defence of Platonism against the immoral, irrational, and pessimistic doctrines of negative otherworldliness. Plotinus draws a line between virtue, beauty, and truth, emphasizing Plato’s teaching of ethics, aesthetics, and metaphysics. Plotinus’ criticism of Gnosticism is an abridgment of his virtue ethics where the meaning of arete is justified for its importance for the soul’s purification, unity, and self-improvement. Plotinus shows his ethical standpoint on the value of human life. The life of the wise and virtuous soul is not to abandon the material world in a disinterested way of life, but to understand through virtue the divine origins of the soul and recognize the beauty and goodness of the intelligible world in the soul’s self-perfection.

Particularly, in chapter 15, Plotinus states that “we must be particularly careful and not to let escape us” what the immoral arguments of the Gnostics do to the souls (15.1-3). He distinguishes between two theoretical directions about the “end” (telos) of life (15.4-8): whereas, for the first, the end is the pleasure (hedone) of the body, for the second, the end is nobility (kalon) and virtue (arete).  Plotinus further divides the first theoretical direction into two schools of thought: (1) Epicurus and the Epicureans, who abolish divine providence and extol pleasure and enjoyment (8-22); (2) the Gnostics, who are pessimistic about the material world and promote an ascetic life without virtue and goodness (22-40). Prima facie the classification of the Epicureans and the Gnostics into the same category is puzzling: whereas the Epicureans were known for their hedonistic views, the Gnostics were known for their ascetic and detached views. Probably, Plotinus’ aim is to offer a philosophical comparison in a dialectic form in order to answer two dissimilar schools of thought, both of which, however, omit virtue ethics and divine goodness. According to another perspective, Plotinus perhaps considers a common alienated attitude both in the Epicurean life of pleasure and in the Gnostic life of asceticism.

For Plotinus, the Gnostics are immoral for neglecting the role of virtue in human life and noetic ascent. The Gnostics omit to define virtue, and they fail to explain how to attain the higher world without virtue. No treatise is devoted to virtue, and their treatment of virtue is completely absent from their doctrines: “they do not tell us what kind of thing virtue is, nor how many parts it has, nor about all the many noble studies of the subject to be found in the treatises of the ancients, nor from what virtue results and how it is to be attained, nor how the soul is taking care of itself, nor how it is purified.” (II 9.15.30-33) Plotinus argues that “looking to god” without knowing how to look is insufficient because only virtue leads the soul to the goal of divine aspiration (15.33-40).

Plotinus further relates virtue to beauty and the divine (II 9.16-18). Perceptible beauty is a reflection of the intelligible beauty, and the wise soul is able to recognize the beauty and goodness of the intelligible world through an inner sight to the perceptible world (II 9.16.39-48). Plotinus justifies the difference between Platonic and Gnostic otherworldliness. Whereas Plato’s otherworldliness accepts the beauty and goodness of the material world (in Plato’s Timaeus), Gnostics’ otherworldliness denies the beauty of the universe and the divine goodness of the Demiurge (II 9.17). Plotinus defends Plato and the beauty of the earthly world by using the metaphor of two people living in the same fine house, “one of whom reviles the structure and the builder, but stays there none the less, while the other does not revile, but says the builder has built it with the utmost skill, and waits for the time to come in which he will go away, when he will not need a house any longer” (II 9.18.3-9; trans. Armstrong).

Virtue forces the soul to recognize both itself and its divine origins and to guard itself against the strokes of fortune (18.26-30). The higher soul of the universe is not troubled; “it has nothing that it can be troubled by. We, while we are here, can already repel the strokes of fortune by virtue and make some of them become less by greatness of mind and others not even troubles because of our strength”; when our soul contemplates the completely untroubled state of the world soul, the universe and the stars, we become our true selves, well prepared for any possible misfortune (30-35) (see also Ennead I 4.8).

9. Virtue Ethics after Plotinus

a. Porphyry

In Sententiae ad intelligibilia ducentes (section 32) Porphyry systematized Plotinus’ treatment of the four cardinal virtues expounded in Ennead I 2 (O’Meara 2003; Kalligas 2014). Porphyry stressed the importance of purification in virtue ethics and particularly the significance of purification in self-knowledge and the care of soul. He underlined the necessity of detachment from the soul’s bodily pleasure and irrational passions, its disregard of pains produced by sense-objects, and any kind of inclination on the part of the soul to the corporeal world. For Porphyry, the virtuous soul achieves impassibility by completely removing bodily dispositions (Sententiae 32.89-140).

Porphyry suggested a specific scale of the cardinal virtues following an ascending exposition of the soul’s need for purification: from the lower civil and practical life of the earthly realm to the higher paradigmatic life of the intelligible Forms. The scale of virtues begins (P1) from the level of political virtues and civic life, continues (P2) to the level of purificatory virtues and soul’s primal noetic ascent, (P3) to the theoretical virtues of the contemplative mind, and (P4) to the exemplary or paradigmatic virtues of the intelligible world. The cardinal virtues of courage, self-control, justice, and wisdom apply throughout the four levels or states of being. Whereas the object of the “civic virtues” (P1) is to moderate passions and to conform conduct to the laws of human nature, the “purificatory virtues” (P2) detach the soul completely from passions. The object of the “contemplative virtues” (P3) is to apply to the soul pure intellectual activities, without any concern about passions, while the paradigmatic virtues (P4) are the exemplars and archetypes of all other virtues (Sententiae 32.83-89).

Porphyry begins section 32 of the Sententiae with the application of virtues to different states of human experience and focuses on different expressions of virtues with respect to different levels of purification: between the virtues of the citizen, the virtues of the soul that attempts to rise to contemplation, the virtues of the soul that purely contemplates intelligence, and finally the mind that possesses pure intelligence and that is completely separated from the bodily level of the soul (1-5). As Porphyry summarizes, “the practical virtues make man virtuous (spoudaios); the purificatory virtues make man divine (daimonios), or make the good man a benign spirit (daimon agathos); the one who acts only in accordance to contemplative virtues becomes a god (theos); while the one who acts in accordance with the paradigmatic virtues is the father of gods (theon pater)” (89-94 translation Guthrie, 1988, modified).

Furthermore, Porphyry places an emphasis on the political virtues and their civic importance as the first stage of excellence in terms of moderation of the passions (metriopatheia) and appropriate moral duty underlined by pure reason. For Porphyry, political virtues contribute to the harmonious civic life with fellow human beings and “mutually unite all citizens”. The political virtues are human virtues and a necessary precondition to the noetic ascent of the soul to the higher realms. There is a necessity to exercise humanity in the self before its application to fellow-humans or the purification at higher levels of being (Sententiae 32, 6-14).

For Porphyry, the contemplative man is detached from the political sphere and the virtues possessed are called “purifications” since they aim at higher realities and genuine existences. The soul of the contemplative man is raised above the passions of the earthly life to the intelligible realm and in likeness to the divine (15-33). However, as Porphyry clarifies, there is “a difference between purifying oneself, and being pure” (33-35). The role of purificatory virtues is twofold: they both purify the soul and coexist as qualities in the purified soul. The importance of the purificatory virtues lies in their power to release the soul completely from any form of evil, either the one related to lower things or the one related to passions. The political virtues release the soul only from passions (35-50).

At a level higher than the purificatory virtues, Porphyry places the contemplative virtues, “the virtues of the soul that contemplates intelligence”. The purified soul directs its activities to the higher intelligible realm, and the four cardinal virtues manifest different kinds of qualities of the soul in constant contemplation of the intelligible beings (51-63). Finally, Porphyry suggests a fourth kind of virtues, the paradigmatic virtues, which belong to the realm of the Forms and reside within the higher Nous.

At the intelligible level of the Forms, the virtues are identified with specific intelligibles (noeta). Porphyry’s claim has been considered a departure from Plotinus’ position in Ennead I 2 (by following Aristotle) that virtues should not be seen as archetypes in the intelligible world of the Forms. However, Porphyry follows Plotinus in claiming that one who possesses the superior virtues also possesses the lower virtues, but not vice versa. In fact, one who possesses the higher virtues is not interested in practicing the lower virtues. Furthermore, Porphyry underlines the intrinsic value of virtues by upgrading their ontological status, while Plotinus highlights their psychological value in the soul’s noetic purification.

For Porphyry, the superiority of the paradigmatic virtues, compared with the virtues of the soul, lies in the fact that the virtues of the soul are images of the “archetypal” paradigmatic virtues, and so they subsist in the divine Nous simultaneously (63-70). As Porphyry synopsizes: “1, the paradigmatic virtues, characteristic of intelligence, and of the being or nature to which they belong; 2, the virtues of the soul turned towards intelligence, and filled with her contemplation; 3, the virtues of the soul that purifies herself, or which has purified herself from the brutal passions characteristic of the body; 4, the virtues that adorn the man by restraining within narrow limits the action of the irrational part, and by moderating the passions” (70-78; translation Guthrie 1988, modified).

b. Iamblichus

Iamblichus, in his work On Virtues (not preserved today), develops the scale of virtues of Porphyry’s Sententiae 32 (O’Meara 2003; Kalligas 2014; Finamore 2012). Iamblichus added two more levels of virtue below the political: the natural virtues (at the lowest level) and the ethical virtues (immediately below the political virtues), as well as the hieratic virtues at the highest level of the scale. Iamblichus’ scale of virtues, in ascending order, is: (1) natural; (2) ethical; (3) civic; (4) purifying; (5) contemplative; (6) paradigmatic; and (7) hieratic (apud Damascius, In Phaedo I.138-144).

Iamblichus suggested a level below the civic or political virtues in order to underline the importance of virtues and their cultivation both in children and certain animals (O’Meara 2003). He emphasized the importance of “habituation” (ethismos) at the level of ethical virtues and highlighted the classical association between virtue and habituation (hexis) found in Plato and Aristotle. However, Iamblichus, closely following Plato’s teaching, reconsidered the educational importance of the ethical virtues in molding and bringing up children. A virtue ethics education is presupposed for political virtues, which entail maturity and rationality (O’Meara 2003).

In Iamblichus’ canon, a selection (or curriculum) of twelve Platonic dialogues used to initiate the student into Plato’s original teaching, the scale of virtues is important for the soul’s purification and its progressive noetic ascent from the nature of the self (Alcibiades I) to a complete treatment of the divine nature (Parmenides). The virtues follow the purpose of the Platonic dialogues and the order of being to guide the soul’s likeness to god. For instance, the Gorgias, the second dialogue in the list, involves civic virtues, while the Phaedo, the third dialogue in the list, involves purificatory virtues. Moreover, whereas the Cratylus and the Theaetetus, fourth and fifth in the list, refer to contemplative virtues and emphasize logic, the Sophist and the Statesman, sixth and seventh in the list, refer again to contemplative virtues but with an emphasis on nature and the perceptible world. The Phaedrus, the Symposium, and the Philebus (eighth, ninth, and tenth respectively in the list of the curriculum) are related to contemplative virtues with theological purposes and the nature of the Good. Finally, the Timaeus, eleventh in the list, entails physical education with reference to the nature of the cosmos. It is noteworthy that the Timaeus and the Parmenides are considered “perfect” dialogues, which sum up the previous ten.

Iamblichus’ detailed development of the scale of virtues offers a comprehensive and insightful analysis of human morality, from the natural level of being to the highest form of divination. As in Plotinus and Porphyry, the ascending scale of virtues follows the noetic ascent and the progressive purification of the virtuous soul that achieves likeness with the divine. As O’Meara (2003) maintains, “the Iamblichean scale of virtues remains a method of progressive divinization, a process of complexity worthy of the metaphysical world-view of the later Neoplatonists” (145). Iamblichus’ fine elaboration of virtues was influential on the work of Marinus and Damascius and shows the importance of human excellence both in the practical and the theoretical sphere.

10. References and Further Reading

a. Texts and Translations

  • Armstrong, A. H. (1966-1988) Plotinus. 7 vols. Loeb Classical Library. Cambridge: Harvard University Press.
  • Henry, P., and Schwyzer, H. R. (1964, 1976, 1982) Plotini Opera. 3 vols. (editio minor) Oxford: Clarendon Press.

b. Commentaries

  • Kalligas, P. (2014) The Enneads of Plotinus: A Commentary, Volume 1, Elizabeth Key Fowden and Nicolas Pilavachi (trs.), Princeton; Oxford: Princeton University Press.
  • Kalligas, P. (1994-2009) Plotinus’ Enneads I-V: Ancient Greek text, translation and commentary. Athens: Academy of Athens.
  • Leroux, G. (1990) Traité sur la liberté et la volonté de l’Un. [VI.8 (39)] Paris: Vrin.
  • McGroarty, K. (2007) Plotinus on Eudaimonia. A Commentary on Ennead I.4. Oxford: Oxford University Press.

c. Introductory Sources

  • Gerson, L. P. (1994) Plotinus. London/New York: Routledge.
  • O’Meara, D. (2003) Platonopolis. Platonic Political Philosophy in Late Antiquity. Oxford: Clarendon Press.
  • Smith, A. (2004) Philosophy in Late Antiquity. London/New York: Routledge.
  • Wallis, R. T. (1995) Neoplatonism. London: Duckworth.
  • Wright, M. R. (2009) Introducing Greek Philosophy. Acumen.

d. Ethical Theory

  • Baltzly, D. (2004) ‘The Virtues and ‘Becoming Like God’: Alcinous to Proclus’, Oxford Studies in Ancient Philosophy 26: 297-321.
  • Dillon, J. M. (1996) ‘An ethic for the late antique sage’ in The Cambridge companion to Plotinus. Gerson, L. P. (ed.), Cambridge: Cambridge University Press, 315-335.
  • Dillon, J. M. (1983) ‘Plotinus, Philo and Origen on the Grades of Virtue’, in H.-D. Blume and F. Mann (eds), Platonismus und Christentum. Festschrift für Heinrich Dörrie, Münster Westfalen: 92-105.
  • Plass, P. (1982) ‘Plotinus’ ethical theory’, Illinois Classical Studies 7, 2: 241-259.
  • Remes, P. (2006) ‘Plotinus’ Ethics of Disinterested Interest’, Journal of the History of Philosophy 44: 1-23.
  • Rist, J. M. (1976) ‘Plotinus and Moral Obligation’ in The Significance of Neoplatonism. Harris, R. B. (ed.), Virginia: ISNS, 217-233.
  • Schniewind, A. (2003) L’Éthique du Sage chez Plotin. Le paradigme du spoudaios. Paris: Librairie Philosophique J. Vrin.
  • Smith, A. (1999) ‘The Significance of Practical Ethics for Plotinus’ in Traditions of Platonism: Essays in Honor of John Dillon. Cleary, J. J. (ed.), Aldershot, 227-236.
  • Stamatellos, G. (2015) ‘Virtue and Hexis in Plotinus’, International Journal of the Platonic Tradition 9 (2): 129-145.

e. Human Freedom and Selfhood

  • Eliasson, E. (2008) The Notion of That Which Depends on Us in Plotinus and Its Background. Leiden/Boston: Brill.
  • Leroux, G. (1996) ‘Human Freedom in the thought of Plotinus’, in Gerson, L. P., The Cambridge companion to Plotinus. Cambridge: CUP, 292-314.
  • Remes, P. (2007) Plotinus on self: the philosophy of the ‘We’. Cambridge, New York: Cambridge University Press.
  • Stern-Gillet, S. (2009) ‘Dual Selfhood and Self-Perfection in the Enneads’, Epoché 13, 2: 331-345.

f. Post-Plotinian Virtue Ethics

  • Finamore, J. (2012) ‘Iamblichus on the Grades of Virtue’ in Iamblichus and the Foundations of Late Platonism. Eugene Afonasin, John Dillon, John F. Finamore (eds.), Leiden; Boston: Brill, 113–132.
  • Guthrie, K. (1988) (tr.) Porphyry, Launching-Points to the Realm of Mind. Phanes Press.

 

Author Information

Giannis Stamatellos
Email: istamatellos@acg.edu
The American College of Greece
Greece

Aesop’s Fables

With the possible exception of the New Testament, no works written in Greek are more widespread and better known than Aesop’s Fables. For at least 2500 years they have been teaching people of all ages and every social status lessons about how to choose correct actions and about the likely consequences of choosing incorrect ones. However, because the fables do not fit the model of philosophy that would be developed later by thinkers like Plato and Aristotle and their successors, they are often disregarded by philosophers; and because they are regarded as having been written for children and slaves, they are often not taken seriously as a source of information about practical ethics in ancient Greece.

In order to provide some context for the fables themselves, after a brief introduction the first part of this article discusses the Life of Aesop, a pseudo-biographical text about the fables’ legendary author. Next, the article considers the form and content of fables, and how these limit what the fables can do while also providing opportunities that other forms of communication do not. Finally, the article looks at some specific fables and the messages that can be taken away from them, in order to demonstrate the kinds of ethical principles that the ancient Greeks conveyed using this kind of philosophizing—and which are still present in the fables that are read and recited around the world today.

Table of Contents

  1. Introduction
  2. The Life of Aesop
  3. Aesopic Fable as a Kind of Philosophy
  4. Philosophical Values in Aesopic Fable
    1. The Strong and the Weak
    2. Friends and Enemies
    3. Intelligence/Foolishness
    4. Overambition/Failure
    5. Truth/Honesty/Lies/Deceit
    6. Gods
    7. Reciprocity
    8. Women, Family, Love
  5. Conclusion
  6. References and Further Reading

1. Introduction

This article refers to the fables under consideration as “Aesopic” fables to show that they are attributed to Aesop while also being clear that Aesop is not necessarily their actual author. The ancient Greeks believed that there had once been a man named Aesop who was the originator of the fable and author of its earliest examples, and it became traditional to attribute all fables to him, just as Americans currently tend to attribute any clever remark to Mark Twain. However, there are at least two problems with this view of Aesop as the creator and author of fables. First, there is very little evidence to suggest that Aesop ever existed. This is not surprising, given that he allegedly lived during the sixth century B.C.E., centuries before the Greeks who were writing down his fables were born; and there is very little surviving evidence from that era about anything. In addition, the ancient Greeks were not scrupulous about historical detail: if something should have been written or said or done by a particular person, then they attributed it to that person. (For example, the Athenians attributed to Solon many laws that are documented as having been enacted well after his death.) There is a surviving pseudo-biography of Aesop that is discussed below, not for its historical accuracy or value, but in order to bring out some of the beliefs that the Greeks had about the kind of person who should have written the fables, because, as was noted above, these beliefs tell us something important about the fables themselves. Second, we know that Aesop could not have been the originator of the fable form because fables predate the Greek civilization of which he was supposed to have been a part by many centuries. Their origins are lost, in part, because they were orally transmitted for an unknown period of time before being written down, but (as has been said) stories that are clearly recognizable as fables have been found in tablets written in ancient Sumeria.

2. The Life of Aesop

Even though Aesop probably never existed, it is helpful in understanding how the ancient Greeks thought about the fables to understand who Aesop was thought to have been, and how he was thought to have lived his life. We can reasonably assume that the “life story” of the inventor of the fables developed along the lines that would have been found most compatible with what the Greeks thought the fables were. Therefore, by learning what the Greeks thought about the author of the fables, we can expect to learn something about what they thought about the fables themselves.

So, who was Aesop to the ancient Greeks? We know that Aesop was widely known in the ancient Greek world. We find references to him and his life in Herodotus, Plato, Aristotle, and Aristophanes, and while those references may not be historically accurate, they do show that the audiences for the works of these four men (a historian, two philosophers, and a comic playwright), which would have included citizens from a wide range of social classes, knew who Aesop was and could be expected to respond to references to him in predictable ways. They also show that he was well known and important enough for these authors to decide that he was worth including in their writings in the first place, and this can only be because his life and fables were believed to be useful cultural material and worthy of attention.

Setting aside the references mentioned above, an extended account of Aesop’s life can be found in the pseudo-biographical Life of Aesop, which is believed to have been written in roughly the 2nd century C.E., although much of it is a compilation of older stories that were part of oral tradition (for example, the Life of Ahiqar). The details of his life, although they may be entirely fictional, are important because while today we tend to draw sharp distinctions between how a philosopher does their job and how they live their life, in ancient Greece and Rome this was much less the case. The philosopher was expected to live their life according to their principles, and accordingly what one did (or was believed to have done) had a real impact on how their philosophy was received. Therefore, Aesop’s life can be seen as an embodiment of the principles he lives by, and vice versa: we can learn about fables through the “biography” of the person who wrote them, whether or not Aesop ever actually existed. Rather than analyzing the entire text in detail, this article will offer a short summary, and then look in more detail at four especially salient aspects of his life. First, he is said to have begun his life as a slave; second, he is said to have been extremely ugly, as though he were not entirely human; third, he begins his life unable to speak; and, finally, his rise from slavery to greatness also leads to his destruction. As we will see, each of these qualities marks him as being on the boundary between human beings and the other animals that feature so prominently in Aesop’s fables.

Several versions of the Life of Aesop have survived the centuries, and while they have differences, they are the same in broad outline. Aesop, we are told by the unnamed author, was a slave from Samos, a Greek island in the Northern Aegean. He had a number of distinctive traits. He was remarkably ugly, and is frequently compared to animals in terms of his appearance. He was born mute, entirely unable to speak, which is another trait usually associated with animals, who can make sounds but cannot make words or speeches. However, he was also remarkably intelligent and resourceful. This is illustrated by an incident early in the Life in which he is successfully able to defend himself from a false accusation of eating stolen figs by getting the slaves who were the actual culprits to unwillingly reveal their guilt even though he is unable to tell the master what has happened. Aesop does this by drinking warm water and vomiting, which reveals that he had not recently eaten figs. He then gets their master to make the other slaves drink warm water and vomit, which leads to them vomiting up the evidence. He is spared, and they are beaten. He is also pious: One day he helps a priestess of the goddess Isis who has strayed from the road and become lost, and Isis and the Muses repay him for his help by “conferring on him the power to devise stories and the ability to conceive and elaborate tales in Greek.” (The version of the Life used in this article is the one found in Daly’s book referenced below. It is probably the most widely available source of the Life.) Shortly after this the slave overseer realizes that if Aesop can speak, he is in a position to convincingly relate the overseer’s abuse of slaves and other wrongdoing to the master. (Since the other slaves, who can speak, have not already reported the overseer, we are already being made aware that there is something exceptional about Aesop’s insistence on being well treated, as though he were a human and not an animal.)

The overseer is able to get a slave dealer to pay him a pittance and take Aesop away, but when the dealer takes Aesop to the slave market to sell him, he is at first unable to find a buyer because Aesop is so ugly. The slave dealer is eventually able to sell him, for almost nothing, to the philosopher Xanthus. (There is a connection here, which may be intentional, between Aesop and speaking animals. In Homer’s Iliad at XIX.400, it is a horse named Xanthus who is briefly given the power of speech by Hera in order to reply to Achilles’ demand that his horses do a better job of keeping him from harm than they had done with Patroclus.) The next section of the Life describes Aesop’s activities while a slave of Xanthus, and in a number of different episodes Aesop demonstrates that he is in fact wiser than his master: although he is legally a slave and has no formal education, when it comes to wisdom, cleverness, and proper use of language, the qualities that philosophers like Xanthus claim make them superior to other human beings (and to the animals), Aesop is in fact the master. In all of these episodes, Aesop is not merely showing off his superiority. All of his efforts are turned toward gaining his freedom, but largely due to Xanthus’ arrogance and dishonesty they always fall short. Aesop is finally freed only when Xanthus’ fellow citizens call on Xanthus to free him so that he may interpret a portent of the future before a meeting of the Assembly. Xanthus had promised to interpret the portent himself before realizing that he could not do so and being driven to the brink of suicide, and Aesop had helpfully (and ironically) advised the citizens that it is not proper for a slave to address free men in the Assembly. After Aesop correctly interprets the portent, he gains fame and fortune, skillfully solves problems and riddles for famous and powerful figures, and occasionally tells fables along the way. However, in the end it is his very success that leads to his ruin.
Although he is successful in his service to the king of Babylon, so much so that the king raises a golden statue in his honor, Aesop decides to travel to Delphi. On the way, he visits many cities and demonstrates his wisdom, receiving payment from cities whose citizens have been impressed by these demonstrations. But when he does the same at Delphi, the people there do not give him any reward for his performance. In return, Aesop mocks the Delphians as being like driftwood, which seems like something worthwhile at a distance but is revealed to be worthless when seen up close. He goes further and tells them that it is not surprising that they are worthless, because their ancestors were slaves (apparently forgetting that he himself was once a slave). The Delphians are outraged by his abuse, hide a golden cup from the temple of Apollo in his luggage, arrest him as he leaves town for allegedly trying to steal it, and sentence him to death. He is unable to persuade them not to kill him, and in the end he is either thrown off of a cliff by the Delphians or, in another tradition, jumps from the cliff himself instead of dying at their hands. The Life ends by noting that the Delphians were afflicted by a famine for killing Aesop and were subsequently punished by the Greeks, Babylonians, and Samians.

What can we take away from this story about what fables are and how they were regarded in ancient Greece? First, it is widely accepted that attributing authorship of the fables to a slave means that the messages of the fables were primarily intended for slaves, or that they were created by slaves, or both. Why would slaves be thought to be particularly appropriate as the creators and audience for animal fables? Two arguments, which are not mutually exclusive, have been put forward. First, many authors have noted that fables allow for the possibility of hidden messages. They allow slaves to tell stories to one another about the cruelty of slavery and how its effects can be mitigated or evaded, without communicating in a way that will get them caught and punished by their masters. The fables can also provide messages about how to successfully survive in a world in which the odds are stacked against you. (Another example of this would be the Uncle Remus stories, which allowed African-Americans to criticize and make fun of whites, as well as share advice about how to survive, without suffering unwanted consequences.) Second, it is important to recall that as an ugly slave, unable to speak, Aesop himself is on the boundary between human and animal at the beginning of his life. His slave status would by itself mark him as being on this boundary. Athenians commonly referred to slaves as “boy”; slaves had no individual identities, like the animals in the fables (and, in fact, slaves were also sometimes called “andrapodon,” man-footed animal, related to the word “tetrapodon,” four-footed animal, used to describe cattle). And slaves, like animals, were considered unable to speak in that they had no legal identities: they could not represent themselves in public because speaking in public is a characteristic of human beings (hence Aesop’s insistence that if he is to speak freely to the Samian Assembly in interpreting the portent he must have his freedom).
So, fables, which so often feature animals in order to teach lessons to humans, are believed to have been invented by an author who is himself on the border of the animal and the human. It is only once he reaches the pinnacle of fame, wealth, and influence—when he has left his beginnings as almost more animal than human behind and moved from the low end of the human hierarchy to the high end—that he makes the errors in judgment that lead to his death in Delphi. His life story reinforces a significant theme in the fables: that of being unable to change one’s nature and status—although he succeeds for a time, his destruction ultimately comes as a result of these changes. For an example of a fable with a similar message, see Gibbs 327 (Perry 123).

In addition, Aesop’s biography shows us that the fables are related to the animal side of human beings. It is all well and good for Aristotle to suggest that the happiest life is one spent in pure intellectual contemplation or for Plato to tell us that the best life is one spent pursuing knowledge about the Forms of the good and the just and the beautiful, but for most people this kind of philosophy is unavailable, because they do not have the resources to pursue academic philosophy. For some few, linking the human to the divine is an enticing intellectual activity; most of us are closer to the animal than the divine and will benefit more from advice that is framed accordingly. For such people, fables which bring the animal and the human together will be much more valuable than Platonic or Aristotelian philosophy, because fables are focused on practical and embodied philosophy rather than the theoretical and abstract.

3. Aesopic Fable as a Kind of Philosophy

The word “fable” comes from Latin. It ultimately means “story” and is derived from the word fari which simply means “to speak.” Theon famously called it “a false discourse depicting the truth.” Although not all fables are about animals—humans, plants, inanimate objects, and the gods all make appearances—animals certainly predominate, and understanding what fable is requires understanding something about why animals have such a prominent role in them. (Indeed, if we remember that fables were, for a long time, written down on animal skins, it would be fair to say that the ancient fables would not exist if not for animals, either intellectually or physically.)

It’s important to keep in mind that animals were much more important as a part of the life of ancient Greeks than they are for most people in the Western world in the twenty-first century. As they are for many of us today, animals were sources of food and clothing and companionship for the Greeks. However, for the Greeks, they were in addition forms of transportation and conveyance, entertainment, and prestige; they were valued as hunting animals, were used in war, were sources of personal protection, and were an important part of sacrificial rituals linking the human, animal, and divine. Since animals were so deeply involved with their day-to-day physical life, it makes sense that the Greeks would incorporate them into their intellectual life as well. Animals live in a variety of different locations, sometimes in herds and sometimes alone; they engage in a wide range of behaviors and act differently in different settings. Often it would seem to be a simple matter of selecting the right animal in order to evoke a particular understanding of the setting and motivations for the participants in the fable. This allows the author to suggest or imply a lot of backstory in a format which is partially defined by its brevity. So, whereas establishing that a human character is clever might take considerable effort, if the author chooses a fox as one of the characters in the fable, then cleverness is already established as a trait for that character. Similarly, it takes less time to say “this fable is about a mouse” than to establish the timidity of a particular human being.

Of course, stories about animals are only useful lessons for human beings if human beings have traits in common with other animals. For the analogy between human beings and other animals to hold up, human beings must be understood as being a kind of animal themselves. There is a fable that makes this point:

Following Zeus’s orders, Prometheus fashioned humans and animals. When Zeus saw that the animals far outnumbered the humans, he ordered Prometheus to reduce the number of the animals by turning them into people. Prometheus did as he was told, and as a result those people who were originally animals have a human body but the soul of an animal. (Perry 240)

Animals in fable do have one significant difference from animals in the real world as the Greeks saw them: they have the ability to speak, which in the real world is restricted to human beings. (There is disagreement today about whether or not animals can speak, as well as what it means to be able to speak in the first place, but those debates need not concern us here.) Aristotle is perhaps the best-known exponent of the view that speech belongs to human beings alone, which he states in Book 1 of the Politics. Connected to the animals’ inability to speak is their inability to reason (the word logos captures both meanings); Aristotle says at Metaphysics 1.1: “The animals other than man live by appearances and memories, and have but little of connected experience; but the human race lives also by art and reasonings.” And at Nicomachean Ethics X.8, he explains that animals do not partake in contemplation and so cannot be said to be happy. Only if someone can make a conscious choice can their actions be in accordance with happiness and virtue (thus Aristotle also indicates that children (and, presumably, slaves) cannot be happy, because they lack the adult ability to make choices). By giving other animals the ability to speak, the fables blur the lines between humans and those other animals, making it easier for humans to learn from the stories fables tell.

With regard to form, fables have a number of distinguishing characteristics: they are usually very short, typically only a few sentences long; they lack any specific setting in time or place; they typically (although not always) involve animals, who are not named or described; the main character acts so as to bring about some outcome, usually through conflict with another character, but often fails to achieve what they intend to do; finally, the character typically makes some kind of a statement acknowledging where they went wrong and accepting the consequences of their error (which can be anything up to and including death). On the one hand, these characteristics limit what the fable can convey. There is no plot, there is no character development, there is typically only one action, and there does not even need to be any dialogue. On the other hand, the characteristics of the form of fable are perfectly suited for widespread oral transmission, which was for centuries the only way in which they were or could be transmitted, and they continued to be transmitted in that way even after the development of widespread literacy, as indeed they still are today. Their simplicity makes them memorable and helps give them their power. Although the fables lack abstraction, they provide a rich stock of philosophical resources for people who are in need of practical philosophical principles to be used in their day-to-day life. The simplicity of the fable is not a sign of the ignorance or limited abilities of the author or the audience; indeed, the opposite is true because creating an effective fable requires stripping the action and language of the story down to the bare minimum needed to convey the truth it seeks to convey.

Lester Hunt says that “though this sort of speech [fable] is not characteristic of philosophy as we know it, that may be because it represents a form of argument that does not seem to be well suited to serve certain purposes that philosophers characteristically pursue, and not because it fails to be an argument” (Hunt 371). In part, it does not serve those purposes because it pre-dates Socrates, who is seen as the first philosopher in the Western tradition, and Plato, who did more than anyone to fix the boundaries of Western philosophy and to define what it was. As presented by Plato, Socrates was deeply interested in the definitions of words. He wanted to know the answers to questions like “What is justice?” and “What is piety?” and these kinds of questions are what many people associate with philosophy to this day. These questions and others like them are indeed not well suited to the form and content of the fables. As has been said, the fables serve to illustrate the consequences of certain kinds of behavior. Their message is practical rather than theoretical, and simple rather than complex. In the Platonic dialogues, Socrates rejects examples of behavior as suitable definitions for words: a list of actions that are just or pious is not the same as a definition of justice or piety, and Plato’s Socrates insists that we cannot reliably come up with examples of a virtue unless we are able to give an accurate definition of what that virtue is. This would seem to exclude fables from the category “philosophy” because they are specific individual examples of behavior and consequences and not concerned with creating systems or defining terms. (It is worth noting here that Socrates himself often uses myths and other stories, such as the Ring of Gyges in Republic, to advance his philosophical arguments.)

But Socrates is only the founder of philosophy if one accepts that philosophy is the thing that Socrates was the first person to do. If one believes, as Socrates apparently did, that one reason to examine one’s life is to be able to be more self-aware, or to live more happily or successfully, then the earlier traditions of wisdom literature, such as Aesopic fables, which aim at these goals should certainly count. Fables may not be able to tell you about the Form of Justice, but they can suggest some likely consequences of unjust behavior; they may not be able to define Virtue and Vice, but they can give you some examples of what these things look like and suggest which of the two should be chosen in particular situations and what the outcome of that choice is likely to be. It is true that they are not suitable for complex forms of reasoning or logic, or extended argument, but why should these set boundaries on what we believe philosophy is or does? Hunt adds: “Because of the limitations of [fable]—that is, that it must be a short, simple narrative making a clear and memorable point that can reach a wide audience—its interest tends to be overwhelmingly practical” (Hunt 379). This does not, however, make fables less philosophical, especially for the Greek audience that they were originally addressed to. Aristotle tells us that the purpose of practical knowledge (by which he means knowledge about ethics and politics) is to enable people to act properly. Leading people to act properly may sometimes require complicated arguments, but that does not mean that only complicated arguments are philosophy.

In addition, fables deliver their messages through analogy, which is a recognized form of philosophical argument. Not every fable does this, but then not every dialogue is a Platonic dialogue—the form allows, but does not compel, philosophical meanings. Perhaps the best starting point for a consideration of how fables worked as analogies can be found in Book II, Chapter 20 of Aristotle’s Rhetoric, where he discusses how they can be used effectively in persuading people to take political action:

Fables are suitable for addresses to popular assemblies; and they have one advantage—they are comparatively easy to invent, whereas it is hard to find parallels among actual past events. You will in fact frame [fables] just as you frame illustrative parallels: all you require is the power of thinking out your analogy, a power developed by intellectual training.

That is, the speaker shows that the situation the assembly currently faces is similar to a situation described by fable, and shows what happens to the characters in the fable, leaving it to the audience to conclude that if they want a different outcome they must act differently than the characters in the story they have just heard (or, if they want the same outcome, they must act in the same way). This requires the audience to actively take part in constructing the argument: they have to analyze the fable, analyze the current situation, determine whether and how they are similar, and come up with a conclusion regarding how they ought to act. The speaker does not tell the listeners how to act; instead, they leave it to the listeners to reach their own conclusions about the right thing to do—which, again, fits with the methods of practical philosophy.

The listeners can then carry the fable with them in their minds—since fables are written to be short and memorable—so that it can be used in other situations. Someone who knows a lot of fables can probably find one to fit any situation—but in order to use the fable effectively, they must be able to choose the appropriate one for the particular situation they are in. For example, is this a situation which calls for determination and persistence, such as that exhibited by the tortoise in the race with the hare? Or is it a situation which calls for someone to recognize that the goal is unattainable and to walk away, as when the fox realizes that the grapes are not within his reach and decides that they must be sour anyway? Again, the fable’s value as an analogy is dependent on the ability of the person using it to properly determine what the appropriate analogy is and what the fable tells that person about the situation they find themselves in. This practice of reflection seems worthy of being described as philosophical activity in this person.

That analogy can be used within other kinds of philosophy and not just fables can be shown with reference to Plato. Socrates also frequently uses analogy as a form of argument, perhaps most famously in the Apology. For example, after he gets his accuser Meletus to say that out of all the Athenians, only Socrates makes the young men worse, he responds thus:

I am very unfortunate if that is true. But suppose I ask you a question: Would you say that this also holds true in the case of horses? Does one man do them harm and all the world good? Is not the exact opposite of this true? One man is able to do them good, or at least not many; the trainer of horses, that is to say, does them good, and others who have to do with them rather injure them? Is not that true, Meletus, of horses, or any other animals?

And Meletus agrees. From this Socrates concludes that Meletus is wrong in his accusation of Socrates and is not even taking the trial seriously—anyone who thought things through would easily see that, just as only a few know how to improve horses, only a few would know how to make human beings better. Of course, as many people have noted, this may not be a good analogy. Knowing when an analogy applies and when it does not is an important part of taking it seriously and using it properly. Whether this is a valid analogy or not is not important for our point here, which is that it is a form of argument requiring the listener’s active participation in reaching the correct ethical and political judgment about Socrates’ guilt or innocence. And, of course, in the Republic, Socrates offers his famous cave analogy as a way of explaining the nature of human existence. So Plato is willing to use analogy within the realm of higher philosophy when it seems to be the most effective way to communicate what he is trying to explain.

Perhaps the best statement regarding the content of fables is that of Zafiropoulos, who says that fable offers an “exemplary and popular message on practical ethics and which comments, usually in a cautionary way, on the course of action to be followed or avoided in a particular situation” (Zafiropoulos 1). Practical ethics for the Greeks, as exemplified in the writings of Aristotle, was considered an aspect of politics and political education, so that we can see the fables as not only philosophy but political philosophy, telling people not only how they should live but how they should live together, what to expect from other people if they behave in certain ways, how to have successful social interactions, and so on. In this way the fables can be regarded as similar to Greek plays and epic poetry. Both the plays and the epic poems offer examples of fictional characters conducting themselves in particular ways and the consequences of their conduct so that the audience can learn from their choices (and, most significantly, their mistakes). Raaflaub says with regard to the oral tradition of the epic poems of Homer that “It was important not only to the community but also to the elite to propagate positive patterns of behavior and to illustrate the disastrous consequences of negative ones” (Raaflaub 565). Fables had the same function, while being more accessible to everyone in the community.

4. Philosophical Values in Aesopic Fable

The message (or messages) of a particular fable depends on where it is found. If it is located within a particular story, it will derive its message from the story in which it is found, although even then it may have more than one meaning. If it stands on its own, or is found in a collection of fables, its meaning becomes even more fluid. Nevertheless, if we look at the early fable collections, there do seem to be particular themes that emerge.

Many authors have discussed the themes to be found in the fables; what follows draws on the list found in Morgan, Chapter 3, but similar themes can be found in, for example, Zafiropoulos. Gibbs’ collection of the fables organizes them along thematic lines as well, although her categories differ from the ones given below. Included with each category is an example fable, which will be used to show the way in which the fables generally deal with the topic. Also included is the fable number from Laura Gibbs’ edition of the fables, as well as the Perry number, which is the standard reference number for each fable. The text of each fable is copied from Gibbs’ edition, as found on her website. Taken together, the fables provide a useful set of principles for conducting oneself appropriately according to ancient Greek moral beliefs.

a. The Strong and the Weak

Gibbs 131. The Hawk and the Nightingale
Perry 4 (Hesiod, Works and Days 202 ff.)

This is how the hawk addressed the dapple-throated nightingale as he carried her high into the clouds, holding her tightly in his talons. As the nightingale sobbed pitifully, pierced by the hawk’s crooked talons, the hawk pronounced these words of power, “Wretched creature, what are you prattling about? You are in the grip of one who is far stronger than you, and you will go wherever I may lead you, even if you are a singer. You will be my dinner, if that’s what I want, or I might decide to let you go.”

Perhaps the predominant theme in fable is also the oldest. It can be found in the first recorded fable in the Aesopic fable tradition, from Hesiod’s Works and Days (which significantly pre-dates the supposed dates for Aesop’s life). There is some disagreement about the lesson to be taken from this fable, but it seems clear that the opposition is between the strength of the hawk and the words of the nightingale, who has nothing but words to counter that strength. It is the classic statement of “might makes right,” and those who have little power of their own must necessarily learn this lesson quickly and well. In the poem, Hesiod goes on to claim that the exercise of unjust power is wrong and that Zeus will punish it. Whether or not this is true, it is clear that the thought of future divine punishment will not necessarily deter the strong or protect the weak.

b. Friends and Enemies

Gibbs 70. The Lion and the Mouse
Perry 150 (Ademar 18)

Some field-mice were playing in the woods where a lion was sleeping when one of the mice accidentally ran over the lion. The lion woke up and immediately grabbed the wretched little mouse with his paw. The mouse begged for mercy, since he had not meant to do the lion any harm. The lion decided that to kill such a tiny creature would be a cause for reproach rather than glory, so he forgave the mouse and let him go. A few days later, the lion fell into a pit and was trapped. He started to roar, and when the mouse heard him, he came running. Recognizing the lion in the trap, the mouse said to him, “I have not forgotten the kindness that you showed me!” The mouse then began to gnaw at the cords binding the lion, cutting through the strands and undoing the clever ingenuity of the hunter’s art. The mouse was thus able to restore the lion to the woods, setting him free from his captivity.

The theme here in some ways qualifies the previous example, as sometimes those who seem to be powerless turn out to have more power than one might expect. Although the mouse is weak, the lion’s decision to free the mouse ends up working in his favor in the end, as the mouse repays one kindness with another. There is no way to know in advance who might be able to help you in the future, and so it pays to show kindness and benefit others in the hope of future reciprocity.

c. Intelligence/Foolishness

Gibbs 434. The Man and the Golden Eggs
Perry 87 (Syntipas 27)

A man had a hen that laid a golden egg for him each and every day. The man was not satisfied with this daily profit, and instead he foolishly grasped for more. Expecting to find a treasure inside, the man slaughtered the hen. When he found that the hen did not have a treasure inside her after all, he remarked to himself, “While chasing after hopes of a treasure, I lost the profit I held in my hands!”

Here we have the stereotypical example of foolishness: someone who has a good situation but does not properly appreciate it and, in trying to get still more, loses what they have. Throughout the fables, foolish decisions are punished, often by death. Intelligence, on the contrary, gets a good reputation in the fables. Those who are smart, or at least clever, can turn situations to their advantage—as, for example, in Gibbs 104/Perry 124, “The Fox and the Raven,” in which the fox is able to steal dinner from the raven by the crafty use of flattery. They can also sometimes use their intelligence to find ways to protect themselves from those who have superior power and strength, as in Gibbs 18/Perry 142.

d. Overambition/Failure

Gibbs 342. The Jackdaw and the Eagle
Perry 2 (Syntipas 9)

There was a jackdaw who saw an eagle carry away a lamb from the flock. The jackdaw then wanted to do the very same thing himself. He spied a ram amidst the flock and tried to carry it off, but his talons got tangled in the wool. The shepherd then came and struck him on the head and killed him.

This fable and others like it illustrate the importance of not overreaching. In a society such as the majority of ancient Greek cities, which were extremely hierarchical and which did not allow for social mobility, trying to become more than what one is by nature or birth is a strategy not for climbing to the top but for being destroyed. It is this that arguably destroys Aesop in the Life of Aesop: though a slave by birth, he ends up aspiring to be the adviser of kings, and in the end, his change of status leads him to Delphi and thereby to his death.

e. Truth/Honesty/Lies/Deceit

Gibbs 117. The Wolf and the Sleeping Dog
Perry 134 (Chambry 184)

A dog was sleeping in front of the barn when a wolf noticed him lying there. The wolf was ready to devour the dog, but the dog begged the wolf to let him go for the time being. “At the moment I am thin and scrawny,” said the dog, “but my owners are about to celebrate a wedding, so if you let me go now, I’ll get fattened up and you can make a meal of me later on.” The wolf trusted the dog and let him go. When he came back a few days later, he saw the dog sleeping on the roof. The wolf shouted to the dog, reminding him of their agreement, but the dog simply said, “Wolf, if you ever catch me sleeping in front of the barn again, don’t wait for a wedding!”

This fable provides a nicely Machiavellian lesson about promising one’s enemies whatever is necessary while they have you at a disadvantage and then abandoning those promises when the conditions that made you promise no longer exist. Conversely, the lesson may be that when you are in a position of advantage over an enemy, you should not be too quick to accept their promises about their future behavior.

f. Gods

Gibbs 481. Heracles and the Driver
Perry 291 (Babrius 20)

An ox-driver was bringing his wagon from town and it fell into a steep ditch. The man should have pitched in and helped, but instead he stood there and did nothing, praying to Heracles, who was the only one of the gods whom he really honoured and revered. The god appeared to the man and said, “Grab hold of the wheels and goad the oxen: pray to the gods only when you’re making some effort on your own behalf; otherwise, your prayers are wasted!”

The gods do not appear especially frequently in the extant fables, but when they do appear they are usually there to either reward appropriate conduct (or punish inappropriate conduct), or else to serve to remind people that prayers without effort generally do no good. As the Christian proverb has it, “God helps those who help themselves.” Greek religion provides a wider selection of deities, but reaches a similar conclusion.

g. Reciprocity

Gibbs 167. The Murderer and the Mulberry Tree
Perry 152 (Chambry 214)

A robber had murdered someone along the road. When the bystanders began to chase him, he dropped the bloody corpse and ran away. Some travellers coming from the opposite direction asked the man how he had stained his hands. The man said that he had just climbed down from a mulberry tree, but as he was speaking, his pursuers caught up with him. They seized the murderer and crucified him on a mulberry tree. The tree said to him, “It does not trouble me at all to assist in your execution, since you tried to smear me with the murder that you yourself committed!”

This is an unusual fable in that it features not a talking animal but a talking plant. However, the lesson is not an uncommon one: if you attempt to harm others, they will undoubtedly respond in kind. The fable of the lion and the mouse quoted above would also fit here, as the lion’s kindness is repaid by reciprocity on the part of the mouse.

h. Women, Family, Love

Gibbs 496. The Thief and His Mother
Perry 200 (Chambry 296)

A boy who was carrying his teacher’s writing tablet stole it and brought it triumphantly home to his mother who received the stolen goods with much delight. Next, the boy stole a piece of clothing, and by degrees he became a habitual criminal. As the boy grew older and became an adult, he stole items of greater and greater value. Time passed and the man was finally caught in the act and taken off to court where he was condemned to death: woe betide the trade of the thief! His mother stood behind him, weeping as she shouted, “My son, what has become of you?” He said to his mother, “Come closer, mother, and I will give you a final kiss.” She went up to him, and all of a sudden he bit her nose, tugging at it with his teeth until he cut it clean off. Then he said to her, “Mother, if only you had beaten me at the very beginning when I brought you the writing tablet, then I would not have been condemned to death!”

Violence and death are commonplace in the fables, but this one is unusual for the graphic depiction of the violence. Nevertheless, it provides a clear example of how mothers ought to behave: they need to provide clear moral guidance to their children (perhaps through the use of instructive fables?), lest the wayward child turn into a criminal as an adult.

5. Conclusion

This article has described what fable is and the characteristics of the man who was allegedly its inventor in order to make the case that the form and content of Aesopic fable as it existed in ancient Greece were philosophical in nature and taught those who learned the fables valuable moral and intellectual lessons for survival. Although fable is not well suited to complicated or abstract arguments, its brevity and use of argument by analogy provide useful food for thought for those who are looking for simple, effective, and memorable moral principles by which to guide their behavior. Fable is therefore well suited to deliver practical life-lessons that can be applied by anyone who is able to think through their situation and draw on the appropriate fable and the lesson that it teaches. In the Greek world, those lessons were oriented toward the day-to-day lives of people who were often in positions of powerlessness and low status, but even for those who were higher on the socioeconomic ladder, fables could provide valuable instruction. In the modern world, as communications become shorter and more immediate (as on Twitter, Facebook, and other social media), we may see a renaissance of the fable form, although of course the lessons it will communicate in today’s world may be very different from those of ancient Greece.

6. References and Further Reading

  • Adrados, Francisco Rodriguez. History of the Graeco-Latin Fable. Vols. 1 and 3. Leiden, NL: Brill, 2003.
  • Arnheim, M. T. W. “The World of the Fable.” Studies in Antiquity 1979–1980, 1–11.
  • Blackham, H. J. The Fable as Literature. London: Athlone Press, 1985.
  • Carnes, Pack. Fable Scholarship: An Annotated Bibliography. New York: Garland Publishing, Inc., 1985.
  • Compton, Todd. Victim of the Muses. Cambridge, MA: Center for Hellenic Studies, 2006.
  • Daly, Lloyd. Aesop Without Morals. New York: Thomas Yoseloff, 1961.
  • Hägg, Tomas. “A Professor and His Slave: Conventions and Values in The Life of Aesop.” In Conventional Values of the Hellenistic Greeks, edited by Per Bilde, Troels Engberg-Pedersen, Lise Hannestad, and Jan Zahle. Aarhus, DK: Aarhus University Press, 1997.
  • Holzberg, Niklas. The Ancient Fable: An Introduction. Translated by Christine Jackson-Holzberg. Bloomington: Indiana University Press, 2002.
  • Hunt, Lester. “Literature as Fable, Fable as Argument.” Philosophy and Literature 33:2 (2009): 369–385.
  • Katsadoros, George C. “Aesopic Fables in the European and the Modern Greek Enlightenment.” Review of European Studies 3:2 (2011).
  • Kurke, Leslie. Aesopic Conversations. Princeton: Princeton University Press, 2011.
  • Lignell, David. Aesop in a Monkey Suit: Fifty Fables of the Corporate Jungle. New York: iUniverse, 2006.
  • Lissarrague, François. “Aesop, Between Man and Beast: Ancient Portraits and Illustrations.” In Not The Classical Ideal, edited by Beth Cohen, 132–149. Leiden, NL: Brill, 2000.
  • Morgan, Teresa. Popular Morality in the Early Roman Empire. New York: Cambridge University Press, 2007.
  • Nagy, Gregory. The Best of the Achaeans. Baltimore: Johns Hopkins University Press, 1979.
  • Noonan, David C. Aesop & the CEO: Powerful Business Insights from Aesop’s Ancient Fables. Nashville, TN: Thomas Nelson, 2005.
  • Papademetriou, I. -Th. A. Aesop as an Archetypal Hero. Athens: Hellenistic Society for Humanistic Study, 1997.
  • Patterson, Annabel. Fables of Power: Aesopian Writing and Political History. Durham, NC: Duke University Press, 1991.
  • Perry, B. E. Aesopica. Vol. 1. Urbana: University of Illinois Press, 1952.
  • Perry, B. E. Babrius and Phaedrus. Cambridge, MA: Harvard University Press, 1965.
  • Perry, B. E. Studies in the Text History of the Life and Fables of Aesop. Chico, CA: Scholars Press, 1981.
  • Pervo, Richard. “A Nihilist Fabula: Introducing the Life of Aesop.” In Ancient Fiction and Early Christian Narrative, edited by Ronald F. Hock, J. Bradley Chance, and Judith Perkins. Atlanta: Scholars Press, 1998.
  • Plato. Phaedo. Translated by C. J. Rowe. Cambridge: Cambridge University Press, 1993.
  • Plato. Apology. http://classics.mit.edu/Plato/apology.html
  • Raaflaub, Kurt A. “Intellectual Achievements.” In Raaflaub, Kurt A., and Hans van Wees, A Companion to Archaic Greece. New York: Blackwell Publishing, 2009.
  • Short, Jeremy C., and David J. Ketchen Jr. “Teaching Timeless Truths through Classic Literature: Aesop’s Fables and Strategic Management.” Journal of Management Education 29 (2005): 816–832.
  • Van Dijk, Gert-Jan. Ainoi, Logoi, Mythoi: Fables in Archaic, Classical, and Hellenistic Greek Literature. Leiden, NL: Brill, 1997.
  • Winkler, John J. Auctor and Actor. Berkeley: University of California Press, 1985.
  • Zafiropoulos, Christos A. Ethics in Aesop’s Fables: The Augustana Collection. Leiden, NL: Brill, 2001.

Author Information

Edward W. Clayton
Email: edward.clayton@cmich.edu
Central Michigan University
U. S. A

Boredom: A History of Western Philosophical Perspectives

The essayist Joseph Epstein has remarked, “Boredom is after all part of consciousness, and about consciousness the neurologists still have much less to tell us than do the poets and the philosophers.”

Although boredom has not been a major topic for Western philosophers, some important ones have spoken of it and regarded it as a major philosophical theme of human life. They have addressed the following issues: (1) what boredom is, which can be taken as the problem of producing an analysis of the concept of boredom, or as the problem of giving a typology of boredom, or as a phenomenology of the experience of it; (2) what to do about boredom, how to overcome it, lessen it, or learn to live with it; (3) what, if anything, the phenomenon of boredom reveals about matters metaphysical or otherwise deep (for instance, God, being, the meaning of life, human nature, the nature of the self, or the nature of some culture or other); (4) what boredom produces, and what it explains; (5) whether and how boredom represents a fundamental mood or “attunement” to the world for a reflective human being; (6) what the conditions are that produce boredom, and what sorts of beings tend to feel it; and (7) ethical issues that relate to the phenomena of boring others and being bored oneself.

Why is boredom a philosophical issue? The preceding sketch should indicate how boredom may be regarded not only as a legitimate philosophical issue but as a major one. Moreover, there are several aspects of the problem of boredom which prevent its exhaustive treatment in a straightforward biological, psychological, sociological, or statistical way. There is a problem of identifying what boredom essentially is—a part of which is the problem of determining whether it is one thing or something that comes in a variety of importantly different forms or modes. Whatever scientific studies may be able to contribute to this problem, progress toward its solution will inevitably require contributions from conceptual and phenomenological investigations. There is also the fact, emphasized by Lars Svendsen, that most people have difficulty saying whether they are bored or not, both at the moment and in general throughout their lives—a fact that points to obvious limitations of statistical studies that begin from survey questions about, say, whether people tend to be bored and what bores them, and issue in claims about boredom’s prevalence, objects, typical conditions, cures, and so forth. Finally, it seems clear that if any academic discipline has much to say concerning the metaphysical or ethical implications of boredom, it is more likely to be philosophy than any of the empirical sciences.

The main philosophical texts on boredom are A Philosophy of Boredom by L. Svendsen, Boredom: A Lively History by the classicist P. Toohey, and Fundamental Concepts of Metaphysics by M. Heidegger.

Table of Contents

  1. Ancient and Medieval Times
    1. Solomon
    2. Seneca
    3. Acedia
  2. Early Modern Period
    1. Pascal
    2. Kant
  3. Nineteenth Century
    1. Schopenhauer
    2. Thoreau
    3. Kierkegaard
    4. Nietzsche
    5. James
  4. Twentieth Century
    1. Russell
    2. Heidegger
    3. Bernard Williams
  5. Philosophical Work since the 1990s
  6. References and Further Reading

1. Ancient and Medieval Times

There is a debate among scholars, including philosophers, about how far back in history boredom goes. Several philosophers claim that boredom has always plagued human beings, while others hold that it is peculiarly a malady of the modern world. Those holding the latter view do generally admit, however, that there were pre-modern precursors of boredom. It is with discussion of three of these precursors that this study begins.

a. Solomon

Qoheleth (c. 200 B.C.E.), traditionally “Solomon”, author of Ecclesiastes, certainly sounds as though he is speaking in large part of something like boredom, and as though he suffers from the condition himself. What we actually get in Ecclesiastes is nothing like a philosophical analysis of boredom or reflections on any deep implications it might have. Rather, we get expressions of the condition itself, partial identification of its causes or reasons, as well as advice concerning how to reduce it, or anyway how to live a halfway decent life in spite of it.

Expressions of boredom or tedium vitae run throughout the book. “Vanity of vanities, all is vanity.” “The eye is not satisfied with seeing, nor the ear with hearing.” “All I labored to do was vanity and vexation of spirit.” “I hated life because all my work was grievous to me.” These are the sounds a seriously bored (and rather depressed) man makes.

The reasons for boredom in Ecclesiastes seem to be primarily that nothing satisfies and the same old things keep getting repeated, within an individual life, and over countless generations. That which has been is now; and that which is to be has already been. There is nothing new under the sun.

And yet there does seem to be a moral Qoheleth draws.

Go thy way, eat thy bread with joy, and drink thy wine with a merry heart . . . . Let thy garments be always white; and let thy head lack no ointment. Live joyfully with the wife whom thou lovest all the days of the life of thy vanity, which he hath given thee under the sun, all the days of thy vanity: for that is thy portion in this life, and in thy labour which thou takest under the sun. Whatsoever thy hand findeth to do, do it with thy might; for there is no work, nor device, nor knowledge, nor wisdom, in the grave, whither thou goest. (King James Version, 9: 7-10)

That is, live life with gusto, and get enjoyment from what you do, if you can. One might well wonder how effective this advice could be to one truly suffering from a bad case of severe boredom.

b. Seneca

Lucius Annaeus Seneca (4 BCE – 65 CE), the Roman Stoic philosopher, talks about boredom or tedium in his essay, “On Tranquillity,” addressed to his friend “Serenus”, who always seems to need a lot of advice. Based on some things Serenus says, Seneca apparently thinks his friend is on the verge of lapsing into boredom, or at least has gotten himself into a mode of living that leads straight to it. To the modern reader, Serenus’s confessions do not sound like confessions of anything like boredom.

In any event, Seneca takes them as such and proceeds straightway to the following pronouncement.

All are in the same case, both those, on the one hand, who are plagued with fickleness and boredom and a continual shifting of purpose, and those, on the other, who loll and yawn.

Notice here that Seneca includes two central elements in the phenomenon of boredom. On the one hand there is fickleness and restlessness, and on the other a lack of motivation and interest, a weariness that expresses itself in lolling and yawning. His subsequent remarks provide a pretty apt description of the phenomenology of boredom—what the bored person feels, and how he or she is inclined to act (or fail to act). Seneca shall be quoted at length here because of the delightfulness of his prose and the aptness of his portrait of bored people.

[T]hen creeps in the agitation of a mind which can find no issue, because . . . of the hesitancy of a life which fails to find its way clear, and then the dullness of a soul that lies torpid amid abandoned hopes. And all these tendencies are aggravated when men have taken refuge in solitary studies, which are unendurable to a mind that is intent upon public affairs, desirous of action, and naturally restless, because assuredly it has too few resources within itself. When, therefore, the pleasures have been withdrawn which business itself affords to those who are busily engaged, the mind cannot endure home, solitude, and the walls of a room, and sees with dislike that it has been left to itself. From this comes that boredom, dissatisfaction, the vacillation of a mind that nowhere finds rest, and the sad and languid endurance of one’s leisure.

Thence comes mourning and melancholy and the thousand waverings of an unsettled mind, which its aspirations hold in suspense, and then disappointment renders melancholy. Thence comes that feeling which makes men loathe their own leisure and complain that they themselves have nothing to be busy with.

[T]heir mind becomes incensed against Fortune, and complains of the times, and retreats into corners and broods over its trouble until it becomes weary and sick of itself. For it is the nature of the human mind to be active and prone to movement. Welcome to it is every opportunity for excitement and distraction.

Hence men undertake wide-ranging travel, and wander over remote shores, and their fickleness, always discontented with the present, gives proof of itself now on land and now on sea. They undertake one journey after another and change spectacle for spectacle. They began to be sick of life and the world itself, and [think]: “How long shall I endure the same things?”

You ask what help, in my opinion, should be employed to overcome this tedium. The best course would be . . . to occupy oneself with practical matters, the management of public affairs, and the duties of a citizen.

So what do we get from Seneca that will help us in our attempts to understand boredom? We get three things: first, a rather compelling phenomenological account of what the state is like; second, an indication that it can lead to states worse than itself (for example, melancholy, jealousy, and envy); and, third, some advice about how to eliminate or at least ameliorate the condition, namely, through work and immersion in practical affairs.

c. Acedia

Acedia, the “disease that wasteth at noonday” or the demon responsible for the infliction of the disease, was a form of pre-boredom or boredom with sloth that afflicted innumerable practitioners of the religious life (priests, monks, hermits, and the like) in the Christian Middle Ages. Since our concern here is with philosophical thought on boredom, this fascinating chapter in the book of boredom must be largely passed over. But before it is passed over altogether, it should be noted that acedia carried an ethical overtone, and the overtone was negative. If God, God’s world, and the life God has ordained for you seem boring to you, there is almost certainly something wrong with your soul, something you had better hasten to fix.

Readers who wish to understand more about acedia should consult the excellent treatment of it in Toohey 2011.

2. Early Modern Period

a. Pascal

But let us move on to the seventeenth-century French philosopher Blaise Pascal (1623-1662). With Pascal, we are in the era of history termed “modernity” or “early modernity” or the “beginning of the modern world.” Pascal treats boredom with more depth, passion, and insight than any of his predecessors. He does this in his book, Pensées, especially in section II, “The misery of man without God.”

Most of what we get from Pascal are observations of human nature, or of people in general. His primary and oft-repeated point concerning them is that, without diversions and distractions, human beings are naturally bored. Boredom is the natural state of the human being left to his or her own devices.

Diversion is a dominant theme in Pascal’s thought. People cannot live in quiet, peace, and rest with themselves, and so they seek distractions and diversions to draw away their attention from their own empty selves and lives. The diversions do not really work, and so people find themselves returning again and again to perception of the emptiness and nothingness of their own lives, and to a pervasive sense of ennui or boredom, the fit response to their own emptiness and nothingness.

Pascal’s description of the bored and weary person is apt and insightful. He is noteworthy for the claim that boredom and ennui are the natural state of the human being. But his message is not entirely negative. Boredom is the natural state of a human being without God. A life in relation to an infinite God fills the emptiness of the soul and obliterates the restlessness, weariness, and boredom which naturally afflict people.

b. Kant

Immanuel Kant (1724-1804) speaks of boredom in passing. His remarks about it occur primarily in his Lectures on Ethics. Kant believes that boredom plagues the person who is inactive and has nothing to do. His cure for it is activity, either work or participation in activities of recreation and diversion. The person who just loafs and does not engage in activity can find no rest at the end of the day, while the one who has been active can.

It is interesting to note that Kant’s solution to boredom does not carry the theological overtones that Pascal’s does. Pascal advises one to overcome boredom by establishing a relationship with God; Kant just recommends activity, whether of work or play.

3. Nineteenth Century

a. Schopenhauer

We now come to a philosopher who makes boredom a centerpiece of his philosophy. He is the great German pessimist Arthur Schopenhauer (1788-1860). Several things stand out in Schopenhauer’s treatment of boredom.

First, there is his claim that boredom is one of the twin poles of human life. The other pole is need, want, lack, or desire. Here is the way it works. We feel that we lack something, something we need. We pursue it and, if we are fortunate, capture it. But the capture does not bring the satisfaction we had expected. What we get instead is a strong dose of boredom, and we find ourselves casting about to identify another object of pursuit, somehow convincing ourselves that if we can get it, we will experience satisfaction. Neither want nor boredom is a particularly pleasant state to be in; in fact, both are forms of misery. And so life may be viewed as a pendulum that passes back and forth between one bad state and another.

Second, Schopenhauer offers something like a definition of boredom, a brief analysis of the concept, which may be the first offered in Western thought. Boredom, he says, is a “tame longing without any particular object.”

Third, in addition to a definition, Schopenhauer offers a substantive account of what boredom is. Boredom, he says, is the sensation of the worthlessness of existence. Boredom may even be regarded as evidence or proof that existence is worthless. If life itself had any real, positive value, there would be no such thing as boredom. Simply being alive would delight us. But, as things are, we can find no modicum of relief from our misery, except when we are diverted or distracted from our lives.

Fourth, Schopenhauer reflects on what boredom or its absence reveals about the intelligence and complexity of the one who suffers from it. His general claim here is that a propensity to be bored is a sign of intelligence. Animals, he speculates, feel very little boredom. Humans are prone to it in proportion to how smart they are. It takes a rich and varied world to hold the interest of a genius, and the real world often doesn’t measure up. As for those who are content with something like mere everyday existence, they are the stupidest of people, not much, if any, above the level of the brutes.

It should be added that there are exceptions in Schopenhauer to his intelligent bored person. One of these is the human being who is lost in the contemplation and enjoyment of art, especially music. The other is the sage, saint, or mystic who has thoroughly denied the will to live and exists in nirvana, or something like it. But very few can even conceive of such a state, let alone achieve it. The vast majority of intelligent people simply have to put up with long stretches of boredom throughout their lives.

Finally, Schopenhauer stresses the seriousness of boredom more than any of his predecessors. It is a form of misery, and a real scourge on the human race. It can lead to the death of the bored one; it can make him or her hang himself or herself. Or, to overcome it, he or she may find himself or herself the instigator of wars, massacres, and murders.

b. Thoreau

The American Transcendentalist philosopher Henry David Thoreau (1817-1862) does not write about boredom as such at any length. But he does make some remarks about it (he calls it “ennui” or “tedium”), which are striking because they are very nearly the opposite of what Schopenhauer claims about the link between boredom and intelligence, and because they provide one answer to a question that would become prominent in the debates about boredom of later philosophers and other scholars.

Thoreau writes, “Undoubtedly the very tedium and ennui which presume to have exhausted the variety and the joys of life are as old as Adam.” Thoreau here anticipates an issue discussed by philosophers more than a century later. The issue is whether boredom is a natural state that has been around ever since there were humans, or whether it developed, or was invented, in the early modern period and is uniquely an affliction of modernity. Thoreau’s answer is that it is as old as Adam.

Moreover, in contrast to Schopenhauer, Thoreau seems to think that it is those who are less intelligent, less mentally active, and more “asleep” who tend to suffer most from boredom. In his Walden essay “Where I Lived and What I Lived For” Thoreau says:

Moral reform is the effort to throw off sleep. Why is it that men give so poor an account of their day if they have not been slumbering? . . . . The millions are awake enough for physical labor; but only one in a million is awake enough for effective intellectual exertion, only one in a hundred millions to a poetic or divine life. To be awake is to be alive. . . . If we respected only what is inevitable and has a right to be, music and poetry would resound along the streets. When we are unhurried and wise, we perceive that only great and worthy things have any permanent and absolute existence, that petty fears and petty pleasures are but the shadow of the reality. This [reality] is always exhilarating and sublime.

Thoreau’s point about boredom is that it is the state of one of limited mental capacities, or of one “asleep”. An intelligent and alert mind is never bored. In its surroundings there are always a thousand things that are fascinating and sublime. The hum of a mosquito, if alertly attended to, is as fascinating and enthralling as an Iliad or an Odyssey.

c. Kierkegaard

The Danish philosopher Søren Kierkegaard (1813-1855) has several things to say about boredom. Four of them will be mentioned here.

First, Kierkegaard shares with Schopenhauer the idea that boredom is quite a serious matter. He claims at one point that boredom is the root of all evil. The consensus among philosophers is that, while there may be something to this, it is a bit of an exaggeration.

Second, Kierkegaard’s conception of boredom is that it is a kind of nothingness, a nothingness that permeates all reality. He calls it “demonic pantheism”—demonic, because it is that which is empty, pantheism because it is all-pervasive.

Third, in spite of (or because of?) its nature of nothingness, boredom functions as a highly effective impetus to action. This strikes Kierkegaard as something of a paradox. He finds it strange that something as staid and solid—staidness and solidity somehow apparently being compatible with nothingness—as boredom could serve as a motivator and stimulus of action. Desire certainly stimulates to action, which is not so puzzling. But boredom is the opposite of desire, not attraction but repulsion. It is some kind of negative stimulus to action. Hence Kierkegaard speaks of boredom’s action-instigating character as “magical”.

Fourth, for Kierkegaard, boredom is a sort of status symbol. It belongs to persons of rank. He writes, “Those who bore others are the plebeians; . . . those who bore themselves are the chosen ones, the nobility.”

d. Nietzsche

The German philosopher Friedrich Nietzsche (1844-1900) nowhere gives an extended treatment of boredom as such, but he does speak of it here and there throughout his writings, and much of what he says about it is thought-provoking. Here are some of his especially interesting points.

First, boredom is part of the explanation of Christian, saintly, or ascetic ideals and practices. These ideals are created, and these practices are followed, largely in an attempt to combat boredom. They are ways to fight it, ways to find a remedy for it. In On the Genealogy of Morals Nietzsche writes:

What do ascetic ideals mean? . . . . [A]mong physiologically impaired and peevish people (that is, among the majority of mortals) they are an attempt to imagine themselves as “too good” for this world, a holy form of orgiastic excess, their chief tool in the fight with their enduring pain and boredom.

He makes the same kind of point in The Antichrist:

In Christianity the instincts of the subjugated and oppressed come to the fore: here the lowest classes seek their salvation. The casuistry of sin, self-criticism, the inquisition of the conscience, are pursued as a pastime, as a remedy for boredom.

Second, boredom explains not only saints and ascetics. It explains virtually everything. In The Antichrist, Nietzsche says that, according to the story in Genesis at the beginning of the Bible, the old God is bored. So he invents man. Man is entertaining to God, but man himself is bored. God then creates animals for him to play with, but they do not entertain him. So, God creates woman.

If he is serious here, Nietzsche implies that, according to the Biblical story anyway, boredom is powerful indeed. It gave rise to the entire human and animal world!

Third, Nietzsche sometimes speaks positively of boredom. He agrees with Schopenhauer that boredom is a sign of vitality and intelligence in the one who has it. Anticipating Heidegger, Nietzsche says that a person who blocks all boredom from his or her life also blocks access to his or her deepest self and the water that flows from its fountain. In another place, Nietzsche makes a claim about boredom that would make it sound like a good thing to many of us, if not to Nietzsche himself. The claim is that, although normally we look away from those who are suffering, we sometimes attend to them and help them in order to rid ourselves of our own boredom.

But, fourth, Nietzsche also speaks of boredom as something we do not want. In Beyond Good and Evil he writes:

[L]et us be careful lest out of pure honesty we eventually become saints and bores! Is not life a hundred times too short for us—to bore ourselves?

Finally, an amusing remark of Nietzsche’s in his late Twilight of the Idols, which unfortunately may have some truth in it, is worth quoting:

“What is the task of all higher education?” To turn men into machines. “What are the means?” Man must learn to be bored. “How is that accomplished?” By means of the concept of duty. “Who serves as the model?” The philologist: he teaches grinding.

e. James

The American Pragmatist philosopher and psychologist William James (1842-1910) has a couple of interesting things to say about boredom in “The Perception of Time,” Chapter XV of his massive Principles of Psychology (1890).

First, James tells us what boredom is and the conditions under which it arises. Boredom, he says, is an experience or sensation that “comes about whenever, from the relative emptiness of content of a tract of time, we grow attentive to the passage of the time itself.” When bored, you attend closely to the mere feeling of time per se.

Second, James notes that the experience of boredom is unpleasant, even odious, and he offers an explanation of why that is so. The odiousness of the experience of boredom arises from its insipidity. The feeling of bare time is the least stimulating experience we can have, and stimulation is a necessary component for any pleasure we might find in, or get from, an experience. James quotes with apparent approval the statement of another psychologist who says that “the sensation of tedium is a protest against the entire present.”

4. Twentieth Century

a. Russell

The great British analytic philosopher Bertrand Russell (1872-1970) devotes an entire chapter of his popular book The Conquest of Happiness (1930) to boredom. The chapter is rich in ideas, despite some apparent confusion.

Russell offers a view of what boredom essentially is. “Boredom,” he says, “is essentially a thwarted desire for events.” And besides this thwarted desire, there are two additional essentials of boredom. One is a contrast between present circumstances and some other more agreeable circumstances which force themselves irresistibly upon the imagination. The other is that one’s faculties must not be fully occupied. The opposite of boredom is excitement.

Russell moreover begins the tradition of distinguishing different kinds of boredom (unless we can regard Seneca as the initiator of the tradition). Russell’s distinction is rather odd. The two kinds of boredom are the kind that arises from the absence of drugs and the kind that arises from the absence of vital activity.

There are at least two things in this account that many philosophers would take issue with. Those who think of boredom as a kind of empty longing, or a tame longing without an object, or a desire for a desire, would not accept Russell’s suggestion that in boredom circumstances other than the bored person’s present ones force themselves irresistibly upon his or her imagination. And some philosophers would think it obvious that the opposite of boredom is not excitement but interest. An excited person is no doubt interested in something, but a person whose interest is captured by something need not be excited by anything. It seems wrong to consider a person who is interested in something (for example, the crossword puzzle she is quietly working) as bored just because at the moment there is nothing in her that is exactly excitement.

Russell surmises that boredom (or fear of it and a desire to get rid of it) has been a great motivator throughout human history. It has produced wars, persecutions, quarrels with neighbors, and witch-hunts. Russell even speculates that “more than half” of the sins of humankind have been caused by fear of boredom.

Russell observes that boredom is unpleasant and thus it is natural to want to get rid of it. There is a deep-seated desire for excitement in human beings.

But boredom is not all bad. Some boredom may be a necessary ingredient in life. In fact, Russell says that a certain power of enduring boredom is essential to a happy life. He even claims that:

All great books contain boring portions, and all great lives have contained uninteresting stretches. . . . [A] quiet life is characteristic of great men, and . . . their pleasures have not been of the sort that would look exciting to the outward eye. No great achievement is possible without persistent work, so absorbing and so difficult that little energy is left over for the more strenuous kinds of amusement.

The idea is apparently that great lives require those who live them to endure a lot of quietness and boredom; and the same must be endured by those who make great achievements.

One issue much discussed in the twentieth-century interdisciplinary literature on boredom is whether there is more or less boredom in the modern era than in previous times. The usual claim is that there is more boredom in modernity, that is, now. Russell weighs in on this issue and offers the suggestion that there is actually less boredom now than in prior eras. Here is his view. Long ago, in the hunting stage of humanity, there was much excitement and little boredom. Early man was constantly involved in exciting activities that kept him entertained: hunting, fighting, courting, and so forth. But then many centuries ago the agricultural era began, an era that lasted right up to the modern period. Life in the agricultural era was incredibly boring. Work in the fields was generally solitary and repetitive. Life at home in the evenings was as dull as could be. There was no electricity; there were no books, music, or movies; there wasn’t much of anything to do except hunt occasionally for witches. The farmer and his family lived lives of perpetual boredom. But things changed drastically with the coming of the machine age and advances in technology. True, factory workers’ jobs are repetitive and sometimes tedious. But at least the workers usually have company. And during their time off work, there are many things for them to do. Modern life is much less boring than life was for centuries in the agricultural past.

Russell wrote this in 1930. It’s pretty clear that he would say that life is even less boring in the current computer and iPhone age. Most people simply aren’t bored very often these days. They have much excitement and little boredom.

But Russell makes this observation. Although we experience less boredom than our ancestors did, we are more afraid of boredom than they were. They just accepted it; we think that an ideal human life should be completely free of boredom. We tend to think that boredom is not a part of a natural human life.

And so many of us seek one exciting stimulation after another. Russell thinks that this approach to life won’t work. Too much excitement leads to the need for more and more excitement, which results in the end in the inability to be excited at all—and also in the inability to experience many or most of the joys of life. A too zealous quest for excitement leads, paradoxically, to boredom.

Russell isn’t opposed to the pursuit of excitement altogether. If it is engaged in only rarely and then in moderation, it can contribute greatly to the happiness of a life. But in the end Russell recommends a quiet life, one in sync with “the rhythm of the earth”. “A happy life,” he says, “must be to a great extent a quiet life, for it is only in an atmosphere of quiet that true joy can live.”

Russell sometimes speaks as though, in recommending a quiet life, what he is recommending is a life that contains a large amount of boredom. But there may be some confusion on his part here. It seems clear that what he is really recommending is a life that looks boring from the outside, and would be boring to one who needed a lot of stimulation and excitement to be happy, not a life that is boring to the one living it. For the quiet life is one of true joy.

b. Heidegger

The German philosopher Martin Heidegger (1889-1976) discusses boredom at length in his 1929-30 lecture series The Fundamental Concepts of Metaphysics. What Heidegger says there will be the focus of the present summary of his conception of boredom and its significance.

This summary can be little more than a sketch—for several reasons: Heidegger’s treatment of boredom is complex and subject to various interpretations; it cannot be fully understood apart from the vast body of his philosophical work as a whole; and much of it is couched in technical terminology, or in ordinary terms to which Heidegger gives special meanings, which makes it difficult to render into readily intelligible English prose.

But we can say the following things with some confidence. They will necessarily be disjointed. Their relationship is unclear even to many of Heidegger’s close readers.

(1) Heidegger writes about boredom more than any other major philosopher, and perhaps he sees it as having a greater significance than any other major thinker has seen it as having.

(2) The central concept of Heidegger’s philosophy is Dasein, literally, “being there”. Dasein is the kind of being we are. We are beings who are there, in the world. Throughout his long academic career, Heidegger was preoccupied with the question of the meaning of being. In everyday German language the word “Dasein” means “life” or “existence.” Dasein, that being which we ourselves are, is distinguished from all other beings by the fact that it makes an issue of its own being. As Da-sein, it is the location, “Da”, for the disclosure of being, “Sein.”

(3) Fundamental moods or “attunements” figure prominently in Heidegger’s thought. They reveal Being to us. But moods are not to be thought of as mere subjective feelings, inner happenings, or responses to objective facts. A mood is neither internal nor external; a mood goes beyond such a distinction and is a basic characteristic of being-in-the-world. It is by way of a mood that we relate to our surroundings. Moods have epistemic as well as merely subjective significance. They reveal the world to us as much as or more than our senses do.

(4) Boredom, Langeweile, is a fundamental attunement, a mood. Along with anxiety, it is one of the most important and profound ones. Heidegger makes a distinction between being bored with something and boring oneself with something. The latter is a more profound and useful form of boredom. There may be an even more profound form of boredom. Normally, it is there, in us, but asleep. Heidegger wants to wake it up.

(5) Heidegger wants to awaken boredom rather than let it slumber through various forms of everyday pastime. Boredom, and we ourselves, are asleep in our everyday pastimes in our actual life. We like being asleep. We like lives of slumbering distractions. We seek to be occupied because it liberates us from the emptiness of boredom. But why on earth would we want to wake up, and especially to awaken in a mood as dreary and empty as boredom? Boredom removes an illusion of meaning from things and allows them to appear as what they are: emptiness and nothingness. Who in her right mind would want to remove such an illusion?

(6) Heidegger’s main answer to these questions may be: Boredom prepares the mind for profound vision. Svendsen writes:

By awakening the mood of boredom Heidegger believes we will be in a position to gain access to time and the meaning of being. For Heidegger, boredom is a privileged fundamental mood because it leads us directly into the very problem complex of being and time.

Profound boredom can set us on the road to authenticity. When boredom works its magic, what is left is nothing less than Being itself and its meaning—if it has any. But Dasein is still there, and Being can reveal itself to Dasein.

(7) Heidegger has other answers to the questions raised above (about why one would want boredom and its insights). Here are two of them: (a) accompanying sober boredom is a strange kind of calm joy; and (b) “[p]hilosophy is born in the nothingness of boredom.”

(8) Finally, let us mention three of Heidegger’s points about boredom that have some interest in their own right. They are: (a) normally time is transparent, but in profound boredom we experience time as time; (b) boredom is a mood that in many respects is like an absence of mood; it is indeed a mood, a fundamental attunement, but it is also, paradoxically, a kind of a non-mood; and (c) Heidegger’s answer to the question of what exactly in the world it is that bores us is that it is the Boring. Wrestling with these three claims, wondering if they make sense and, if they do, what sense it is, would make a fit pastime for whiling away a slow slumbering evening.

c. Bernard Williams

The prominent English moral philosopher Bernard Williams’s (1929-2003) famous essay, “The Makropulos Case: Reflections on the Tedium of Immortality,” is not about the nature of boredom as such. Its thesis is, roughly, that it would not be a good thing to live forever, for eventually immortal life would become boring. But in the essay, Williams makes several important points about boredom.

Williams’s conception of boredom is apparent from his language. He thinks of boredom as indifference, detachment, coldness, and inner death.

Long-lasting sameness is what brings it on. Williams explores the issue by reference to his test case, a woman called “EM” in a play Williams uses as a springboard for his reflections. EM had taken an elixir at age 42 which kept her at 42 and continued to do so every year she took it. At the time of the action in the play, EM has been 42 for 300 years, and she is bored to death. She refuses this time to take the elixir, and she dies. EM’s boredom is connected with the fact that everything that could happen and make sense to one particular human being had already happened to her.

One implication of this is that, in Williams’s view, boredom is a rather serious thing that can motivate, not suicide exactly, but the choice of death over life.

Williams seems to think that EM’s boredom (and consequent choice of death over life) is entirely understandable, proper, and fitting.

In all this there is the idea, stated more or less explicitly at certain points, that there are circumstances in which one ought to be seriously bored. Not being bored suggests an impoverishment in one’s consciousness of one’s circumstances. Williams writes that “not being bored can be a sign of not noticing or not reflecting enough.” So, Williams may think that sometimes we have a moral, or more broadly ethical, or still more broadly human, reason to be bored. There are times when we ought to be bored, and not being so represents some kind of ethical, intellectual, or human failure.

5. Philosophical Work since the 1990s

It is impossible to do justice here to current writing on boredom. Work on it by philosophers (and others) is thriving. The major philosophical work, as has been mentioned, is Svendsen’s Philosophy of Boredom. Other important works are: Toohey’s Boredom: A Lively History; the Routledge Boredom Studies Reader: Frameworks and Perspectives, edited by M. E. Gardiner and J. J. Haladyn, a rich anthology by a diverse group of contributors exploring multiple issues, many of them philosophical, that the phenomenon of boredom raises; W. O’Brien’s “Boredom” in the 2014 volume of the journal Analysis; and several works on boredom by Andreas Elpidorou, probably the most prolific and certainly one of the most interesting of the writers on the subject at present. Only two of the issues in the current debate will be mentioned here.

First, there is discussion about the question of whether there really is such a thing as “existential” or “profound” boredom (as distinct from everyday situational boredom, whose reality nobody questions). Svendsen, following Heidegger, argues that existential boredom is indeed real and important. Toohey, in contrast, denies that there is any such state, arguing that what has been misidentified as such is actually a particular form of depression. Although there are exceptions, “analytical” philosophers tend to side with Toohey on this matter while “continental” philosophers tend to side with Svendsen. (Perhaps this is due to the great influence of Heidegger on continental thought, an influence largely absent in analytical circles.)

Second, an analysis of the concept of boredom has been proposed by Wendell O’Brien in the journal Analysis. O’Brien suggests that boredom is an unpleasant or undesirable mental state of weariness, restlessness, and lack of interest in something to which one is subjected, a state in which the weariness and restlessness are causally connected in some way to the lack of interest. The extent to which O’Brien’s analysis is satisfactory remains to be seen. Those on the Toohey side of the disagreement just mentioned are likely to accept something like it. Those on the Svendsen side are likely to find it deeply problematic, though they may concede that some variation of the sort of analysis O’Brien proposes might capture the notion of everyday situational boredom.

Readers who wish to pursue this current work should begin by consulting the post-1990 sources listed in the “References and Further Reading” section below.

6. References and Further Reading

  • Ecclesiastes. KJV.
  • Elpidorou, A. 2014. “The Bright Side of Boredom,” Frontiers in Psychology 5: 1245.
  • Elpidorou, A. 2016. “The Significance of Boredom: A Sartrean Reading,” in Philosophy of Mind and Phenomenology: Conceptual and Empirical Approaches, ed. D. O. Dahlstrom, A. Elpidorou, and W. Hopp, pp. 268-285. London: Routledge.
  • Frankfurt, H. 1999. “On the Usefulness of Final Ends,” in Necessity, Volition, and Love, 82-94. Cambridge and New York: Cambridge University Press.
  • Gardiner, M. E. and J. J. Haladyn, eds. 2016. Boredom Studies Reader: Frameworks and Perspectives. London: Routledge.
  • Healy, S. D. 1984. Boredom, Self, and Culture. Madison, N. J.: Fairleigh Dickinson University Press.
  • Heidegger, M. 1995. The Fundamental Concepts of Metaphysics, W. McNeill & N. Walker trans. Bloomington: Indiana University Press.
  • James, W. 1890. The Principles of Psychology. (Now in the public domain and readily accessible online.)
  • Kant, I. 1963. Lectures on Ethics, Louis Infield trans. London: Harper & Row.
  • Kierkegaard, S. 1992. Either/Or, Alastair Hannay trans. & abridged. London: Penguin Books.
  • Lombardo, N. E. 2017. “Boredom and Modern Culture,” Logos: A Journal of Catholic Thought and Culture, 20: 2, 36-59.
  • Millgram, E. 2004. “On Being Bored Out of Your Mind,” Proceedings of the Aristotelian Society, New Series, Vol. 104, pp. 165-186.
  • Nietzsche, F. 1954. The Portable Nietzsche, W. Kaufmann trans. & ed. New York: Viking Penguin.
  • Nietzsche, F. 1968. Basic Writings of Nietzsche, W. Kaufmann trans. & ed. New York: The Modern Library, Random House.
  • O’Brien, W. 2014. “Boredom,” Analysis 74:2, 236-243.
  • Pascal, B. 1958. Pensées, W. F. Trotter trans. New York: E. P. Dutton.
  • Raposa, M. L. 1999. Boredom and the Religious Imagination. Charlottesville: University of Virginia Press.
  • Russell, B. 1930. The Conquest of Happiness. London: Liveright.
  • Schopenhauer, A. 1970. Essays and Aphorisms, R. J. Hollingdale trans. London: Penguin Books.
  • Seneca, L. 1917. Epistles: Vols. IV-VI. Loeb Classical Library, R. M. Gummere, trans. Cambridge, MA.: Harvard University Press.
  • Spacks, P. M. 1995. Boredom: The Literary History of a State of Mind. Chicago: The University of Chicago Press.
  • Svendsen, L. 2005. A Philosophy of Boredom, John Irons trans. London: Reaktion Books.
  • Thoreau, H. 1983. Walden and Civil Disobedience. New York: Penguin Books. (Originally published in 1854.)
  • Toohey, P. 2011. Boredom: A Lively History. New Haven: Yale University Press.
  • Williams, B. 1973. “The Makropulos Case: Reflections on the Tedium of Immortality,” in Problems of the Self, pp. 81-100. Cambridge: Cambridge University Press.
  • Yao, V. 2015. “Boredom and the Divided Mind,” Res Philosophica, 92:4, 937-957.

 

Author Information

Wendell O’Brien
Email: w.obrien@moreheadstate.edu
Morehead State University
U. S. A.

Mind and the Causal Exclusion Problem

The causal exclusion problem is an objection to nonreductive physicalist models of mental causation. Mental causation occurs when behavioural effects have mental causes: Jennie eats a peach because she wants one; Marvin goes to Harvard because he chose to, etc. Nonreductive physicalists typically supplement adherence to mental causation with the view that behavioural effects have distinct sufficient physical causes as well: Jennie eats a peach because the muscles in her arms contracted as a result of the innervations of muscle fibres, which were in turn caused by the release of neurotransmitters from the motor neurons at the neuromuscular junction, and so on and so forth. Nonreductive physicalists, therefore, argue that behavioural effects have sufficient physical causes and distinct mental causes. The causal exclusion problem is the leading objection to this view, and it is based on the causal exclusion principle, which stipulates that events cannot have more than a single sufficient cause. The causal exclusion principle conflicts with the nonreductive physicalist view that behavioural effects have a sufficient physical cause and a distinct mental cause. Critics typically add that the sufficient physical cause of the behavioural effect excludes the mental cause of the same effect, so nonreductive physicalism also fails to secure mental causation.

Various responses to the causal exclusion problem have been suggested. Some overcome the causal exclusion problem by undermining certain metaphysical foundations supporting the problem. For example, some adopt differing models of events and properties, thereby avoiding the thrust of the causal exclusion problem. Others turn to differing models of causation to dissipate exclusion pressures. There are those who resolve the causal exclusion problem by providing robust nonreductive physicalist models of mental causation. These models include supervenience based nonreductive physicalism, emergentism, functionalism, and the realization strategy. Each of these models attempts to provide an account of how the causal exclusion problem does not defeat their model. Still others respond to the causal exclusion problem by rejecting one of the principles undergirding the causal exclusion problem. For example, the epiphenomenalist rejects the principle of mental causation, the interactionist dualist rejects the principle of physical causal completeness, the reductionist abandons the principle of irreducibility, and the compatibilist rejects the principle of causal exclusion. These views must demonstrate the viability of rejecting one of these widely accepted principles.

This article introduces and motivates the causal exclusion problem, and considers the merits and demerits of these avenues of response to it. The stakes are high. Not only does the causal exclusion problem pose difficulties for reconciling autonomous agency with neuroscientific advances that increasingly establish neural causes of behavioural effects, but it also threatens to leak out into other domains. This article closes with a discussion of the possibility that the causal exclusion problem applies to the realm of explanation as well, and of the viability of the view that the causal exclusion problem may generalize to other special science disciplines such as sociology, economics, geology, and biology.

Table of Contents

  1. Introduction
  2. The Causal Exclusion Problem
    1. The Principle of Mental Causation
    2. The Principle of Physical Causal Completeness
    3. The Principle of Irreducibility
    4. The Principle of Causal Exclusion
  3. Solutions to the Causal Exclusion Problem
    1. The Metaphysics of Events and Properties
    2. The Metaphysics of Causation
    3. Supervenience
    4. Emergentism
    5. Functionalism
    6. Realization
    7. Epiphenomenalism and Autonomy
    8. Interactionist Dualism
    9. Reductionism
    10. Compatibilism
  4. Explanatory Exclusion
  5. The Generalization Problem
  6. Conclusion
  7. References and Further Reading

1. Introduction

While disputes about mental causation arise in ancient philosophy, the locus classicus of the problem of mental causation is René Descartes. Descartes argued that the mind is a thinking substance that is distinct from the body, which is an extended substance. Descartes supplemented this substance dualism with a principle of interactionism, according to which the mind causally interacts with the body. For example, Jennie’s wanting a peach, which is distinct from the physical processes in her brain, causes her to eat a peach. Descartes’ interactionist dualism faced numerous difficulties. Princess Elizabeth of the Palatinate and others argued that thinking substance, which is not extended in space, cannot come into causal contact with the extended body. Henry More and others argued that a distinct thinking substance that causally interacts with a body would violate conservation principles by increasing the motion of the universe.

Contemporary discussion on the problem of mental causation typically begins with the Type Identity Theorists of the mid-twentieth century. They argued that mental states are type identical with causally efficacious physical states, thereby securing mental causation. For example, Jennie’s wanting a peach is a causally efficacious physical process in her brain, so Jennie’s peach eating has a mental cause, which is the sufficient physical cause. The type identity theory faced numerous difficulties as well. Hilary Putnam and others argued that mental properties are multiply realizable, so they cannot be identical with specific physical properties. David Chalmers and others argued that mental states have qualitative or intentional properties that are irreducible to physical processes.

The failure of type reductionism led to the currently dominant solution to the problem of mental causation, nonreductive physicalism. Nonreductive physicalists argue that behavioural effects have sufficient physical causes and distinct mental causes. For example, Jennie’s peach eating has a mental cause that supervenes upon a distinct sufficient physical cause of the behaviour. In recent years, this nonreductive hegemony has been likewise threatened. The causal exclusion problem is the principal weapon fashioned against nonreductive physicalist solutions to the mental causation problem. The causal exclusion problem is most thoroughly exposited in a series of articles and books by Jaegwon Kim (Kim, 1998; Kim, 2005). He argues in favour of the causal exclusion principle, which states that: “No single event can have more than one sufficient cause occurring at any given time…” (Kim, 2005, 42). Accordingly, Jennie’s peach eating cannot have a mental cause and a distinct sufficient physical cause, as nonreductive physicalism posits. As a result of this causal exclusion problem, nonreductive physicalist solutions to the mental causation problem are currently threatened.

2. The Causal Exclusion Problem

In brief, the causal exclusion problem amounts to the difficulty of establishing the nonreductive physicalist view that behavioural effects have sufficient physical causes and distinct mental causes, over and against the plausibility of the view that the sufficient physical cause of the behaviour excludes the mental event from causally influencing the behaviour. If mental events are excluded from causally influencing behavioural effects, nonreductive physicalism fails to secure mental causation, a shortcoming that is probably fatal. More formally, according to a common though not universal presentation, the causal exclusion problem is the conjunction of the following four individually plausible, but (seemingly) jointly inconsistent principles.

a. The Principle of Mental Causation

The first of the four principles constituting the causal exclusion problem is the principle of mental causation. Broadly construed, the principle of mental causation stipulates that some events have mental causes:

The Principle of Mental Causation: some events have mental causes.

This initial definition is subject to considerable nuance, including the following three distinctions. First, there are questions about whether mental causes should be construed as substances, events, or properties of events. Traditional models of substance dualism, most famously espoused by René Descartes, suppose that mental causes are substances, such as a soul or a disembodied mind. Most contemporary philosophers reject this view in favour of the view that mental phenomena are events. However, there are deep, yet relevant, disagreements about the nature of events, and whether events, in virtue of certain properties, are causally efficacious. These issues will be dealt with in detail in Section 3.a.

Second, there are questions about whether the events that have mental causes are mental effects, physical effects, or both. Some philosophers endorse the autonomist view, according to which mental events cause mental effects but do not cause physical effects (Gibbons, 2006). Devin’s sadness is caused by his belief that his gecko died, but his crying has physical causes. This view will be dealt with in Section 3.g.

Third, many philosophers distinguish between autonomous mental causation and reduced mental causation. Nonreductive physicalists endorse autonomous mental causation, which is the conjunction of the principle of mental causation and the principle of irreducibility. Autonomous mental causation is the view that the mental-as-mental causes effects. Reductive physicalists endorse reduced mental causation, which is the conjunction of mental causation with a rejection of irreducibility. Reduced mental causation is the view that the mental-as-physical causes effects. Most, even some reductionists (Kim, 2005, 159), agree that autonomous mental causation would be preferable, though the result of the causal exclusion problem may be that autonomous mental causation is not possible within a physicalistic metaphysic.

There are numerous arguments in support of the principle of mental causation. First, Donald Davidson overthrew the consensus against mental causation by highlighting the plausible distinction between having a reason for acting, and acting for a reason (Davidson, 1963). A student may have a desire to impress the teacher as a reason for asking a question, but the student may actually ask the question because he wants to know the answer. This plausible distinction presumes that reasons are causes, which supports the principle of mental causation. Second, the moral responsibility argument: an ought implies a can, and a can implies mental causation. Someone locked up in chains, literally unable to move, is not morally responsible for not helping someone who has fallen. Similarly, if humans are unable to act, since they lack mental causation, they lack moral responsibility (Kim, 2005, 9). Third, the epistemic argument: knowing implies the justification relation between premise and conclusion, or sensation and belief, is a causal relation. Imagine a random syllogism generator that spits out a million invalid syllogisms in a row: monkeys like bananas, tomatoes are red, therefore the sky is blue. Then, once it finally gets lucky: humans are animals, animals are mortal, therefore humans are mortal. There is justified true belief here, but not knowledge. What is missing? Among other things, the conclusion does not occur because of the reasonableness of the premises, where because is taken literally—it must be the cause (Brewer, 1995, 242). Finally, an evolutionary argument (Jackson, 1982, 133): organisms typically inherit traits that enhance fitness, so, probably, mental events enhance fitness. If mental events lack causal efficacy, they do not enhance fitness. So, probably, mental events are causally efficacious. 
For these reasons, many think the principle of mental causation must be taken as a “truism” (Ney, 2007, 486), whose rejection would amount to “the end of the world” (Fodor, 1989, 77).

b. The Principle of Physical Causal Completeness

The second of the four principles constituting the causal exclusion problem is the principle of physical causal completeness:

The Principle of Physical Causal Completeness: every physical event that has a cause has a sufficient physical cause.

This principle is subject to several possible modifications. First, the principle of physical causal completeness is stated deterministically. If it turns out that the completed microphysics is indeterministic, it would be easy to reframe this principle in terms of a probabilistic model of microphysics. For example, every physical event has its probability fixed by entirely physical antecedents (Papineau, 1993, 22; Bennett, 2008, 281). Nothing of substance rides on adopting the deterministic or probabilistic version, but the deterministic reading is often used for the sake of simplicity. Second, this principle, as presently defined, merely stipulates that physical events have sufficient physical causes, but does not require that mental events have sufficient physical causes. Thus, it is possible that physical events have sufficient physical causes, but mental events do not. This possibility is ruled out by the addition of a strong supervenience principle, which, for the purposes of the mental causation debate, can be defined as follows:

The Principle of Supervenience: physical events determine every mental event, and every mental event depends upon physical events.

As a general example of supervenience, imagine a picture of Mona Lisa printed out by a dot-matrix printer. The Mona Lisa depends upon the dots on the page, and the dots on the page determine that the Mona Lisa arises. Likewise in the case of mental causation: Jennie’s desire for a peach is determined by, and dependent upon, some series of neural events in Jennie’s brain. The supervenience principle, combined with the principle of physical causal completeness, says that not only does every physical event have a sufficient physical cause, but every event, including mental events, is determined by physical events. This can serve as an adequate definition of physicalism:

Physicalism: every physical event has a sufficient physical cause, and every event is determined by, and depends upon, physical events.

Not only do many physicalists supplement the principle of physical causal completeness with a supervenience principle, but some also strengthen the principle of physical causal completeness into a principle of physical causal closure. The principle of physical causal closure indicates that physical events only have sufficient physical causes. On physical causal completeness, distinct mental causes are not definitively barred from causally interacting with physical events (Kim, 2009, 38; Montero, 2003, 174). That is, one can admit that every physical event has a sufficient physical cause, while continuing to add a mental cause for the event as well (Marcus, 2005, 19ff; Crane and Mellor, 1990, 206). This possibility is closed off by stipulating that physical events only have physical causes (Vicente, 2006, 150; Kim, 2005, 50; Montero, 2003, 175). Most, however, do not endorse this stronger principle of physical causal closure, since it appears to exclude nonreductive physicalist models of mental causation (Kim, 2005, 52; Lowe, 2000, 572).

The principle of physical causal completeness is supported by two arguments. The first is the appeal to conservation laws. There are a number of sources that extensively discuss the historical ascension of conservation laws in modern physics (Harbecke, 2008, 19ff; Papineau, 2001, 13ff). In brief, Descartes introduced the law of the conservation of motion, according to which the total mass times speed of any set of bodies remains constant. Descartes maintained, however, that the mind could alter the direction of bodies without altering their speed. Leibniz, however, established the law of the conservation of linear momentum, according to which the total mass times speed and direction of any set of bodies remains constant, regardless of how they interact. Leibniz argued that this conservation law closed the physical world off from mental causes. In the nineteenth century, Hermann von Helmholtz added the law of conservation of energy: the total energy, or force, of any system of interacting bodies is conserved, that is, remains the same across time. The result of these conservation laws is that every physical event has a sufficient physical cause. Those endorsing physical causal closure also use this argument to yield the stronger conclusion that every physical event has only a sufficient physical cause, as distinct mental causes cannot add energy to a closed physical system. Second, physical causal completeness is supported by the success of neuroscience. In the past one hundred years, neuroscientists have successfully mapped neuronal processes responsible for a wide range of behavioural effects. Brain regions associated with mental states such as emotions, cognitive capacities, and perceptual capacities have been discovered. While neuroscience is not yet complete, these findings provide increasingly compelling evidence that every behavioural effect has a sufficient physical cause. 
For these reasons, many think that the principle of physical causal completeness is “fully established” (Papineau, 2001, 33).
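
The three conservation laws appealed to in this argument can be written out schematically. The notation is an assumption introduced here for illustration (a system of interacting bodies indexed by i, with mass m_i and velocity vector v_i); it is a sketch of the verbal statements above, not a reconstruction of any author's own formulation.

```latex
% Illustrative notation only: body i has mass m_i and velocity \vec{v}_i.
\begin{align*}
\text{Descartes (quantity of motion):} \quad & \sum_i m_i \,\lvert \vec{v}_i \rvert = \text{constant} \\
\text{Leibniz (linear momentum):} \quad & \sum_i m_i \,\vec{v}_i = \text{constant} \\
\text{Helmholtz (energy):} \quad & E_{\text{total}}(t_1) = E_{\text{total}}(t_2) \quad \text{for any times } t_1, t_2
\end{align*}
```

On the first law a change of direction alone leaves the sum untouched, since only the magnitudes of the velocities enter into it. On the second it does not, since the sum is taken over vectors. This is why the Cartesian proposal that the mind redirects bodies without changing their speed is consistent with the first law but not with the second.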

c. The Principle of Irreducibility

The third principle constituting the causal exclusion problem is the principle of irreducibility, according to which mental causes of behavioural effects are distinct from physical causes of behavioural effects.

The Principle of Irreducibility: mental causes of behaviour are distinct from physical causes of behaviour.

The principle of irreducibility, like the previous two principles, has different readings. Some take the principle of irreducibility to mean that mental properties are distinct from physical properties, though mental events are identical with physical events (Davidson, 1993, 3; Fodor, 1974, 100). That is, a brain event in Jennie’s brain has neural properties, such as its cascading neural activity, and distinct mental properties, such as a felt desire for a peach. On this view, mental causation typically occurs when the brain event, in virtue of its mental properties, causes behavioural effects. Others take the principle of irreducibility to mean that mental properties are distinct from physical properties, and mental events are distinct from physical events (Kim, 2005, 42). In this case, Jennie’s neural activity is a distinct event from her felt desire for a peach. On this view, mental causation typically occurs when the mental event causes behavioural effects. This distinction is central to a strategy for overcoming the causal exclusion problem, as discussed in Section 3.a.

There are two leading arguments in support of the principle of irreducibility. First, Leibniz’ doctrine of the indiscernibility of identicals stipulates that if two entities are identical they share all the same properties. Thus, a moose is not identical with a bear if the moose has antlers but the bear does not, and mental causes are not identical with physical causes if mental causes have distinct properties from physical causes. Some argue that mental events are subjective experiences such as itches or pains while physical events are objective chemical interactions (Chalmers, 1996). Others argue that mental events are purposive, intentional, and rational, while chemical activity in the brain is not (Silberstein, 2001, 85). Still others contrast free agency with physical determinism. The principle of irreducibility is also supported by the multiple realizability argument (Putnam, 1967), according to which mental properties can be realized by a number of different physical property instances. For instance, the same hunger for fish is realized by different neural activity in humans and sharks. Since a self-identical property must always be present wherever it is, hunger cannot be identical with a specific neural activity in humans, since hunger is also present where this specific neural property is absent. Some philosophers add that the same mental event can be multiply realized over time as well (Pereboom, 2002, 503). This way, Jennie’s belief that cheerios are yummy has persisted since she was three, despite slight alterations to the neural correlates constituting her belief over time. For these reasons, many take the failure of the principle of irreducibility to be “simply inconceivable” (Slors and Walter, 2002, 1), as evidenced by reductionists who acknowledge the “grip” of the “compelling intuition” (Papineau, 2002, 3).

d. The Principle of Causal Exclusion

The final principle constituting the causal exclusion problem is the principle of causal exclusion. In its contemporary formulation, the causal exclusion principle is introduced by Norman Malcolm (1968), but is most thoroughly exposited by Jaegwon Kim. Here is Kim’s articulation of the principle:

The Principle of Causal Exclusion: “No single event can have more than one sufficient cause occurring at any given time—unless it is a genuine case of causal overdetermination” (Kim, 2005, 42).

It is worth highlighting several important features of this definition. First, the causal exclusion principle comes with the caveat that an event can have more than one sufficient cause if the event is genuinely overdetermined. Genuine overdetermination occurs when two independent causal processes converge on the same effect: the house burns down because the lit match drops in the garbage at the same time as the lightning strikes the house. The nonreductive physicalist posits that mental events supervene on physical events, so there are not two independent causal processes, and behavioural effects are therefore not genuine cases of overdetermination. As a result, the causal exclusion principle directly opposes the nonreductive physicalist view that behavioural effects can have sufficient physical causes and dependent mental causes.

The causal exclusion principle specifies that events cannot have more than one sufficient cause occurring at a given time. This caveat is added in order to set aside instances involving causal chains where a is a sufficient cause of b, and b is a sufficient cause of c, thereby indicating that a is also a sufficient cause of c. It is acceptable for c to have both a and b as sufficient causes in this way. But, since the nonreductive physicalist argues that mental events supervene on physical events, these events occur simultaneously, so the causal exclusion principle applies straightforwardly.

The causal exclusion principle can be interpreted as stating that behavioural events cannot have two sufficient causes (Arnadottir and Crane, 2013, 254). This interpretation allows for the following trivial solution to the causal exclusion problem: behavioural effects have one sufficient physical cause and a distinct but insufficient mental cause, which does not violate the causal exclusion principle. Indeed, the principle of mental causation does not stipulate that mental events must be sufficient causes, so this solution would be available. The alternative reading of the causal exclusion principle rules out this scenario by stating that behavioural events cannot have a single sufficient cause and any other cause, partial or sufficient (Kim, 2005, 17).

While the causal exclusion principle rules out more than a single sufficient cause of an effect, it does not rule out the possibility that the effect has a sufficient physical cause and a non-causal determinant. This is an especially pertinent point, since physical events non-causally determine supervening mental events. Thus, the following trivial solutions to the causal exclusion problem are available: mental effects have sufficient mental causes, and are determined by subvening physical events (Thomasson, 1998, 183-186); physical effects have sufficient physical causes, and are determined by distinct mental events. These moves are repelled in two ways. First, by appealing to a broader principle of determinative exclusion, according to which effects can have no more than a single determinant, causal or otherwise (Kim, 2005, 17). Second, by appealing to Edwards’s Dictum, which says that sufficient synchronic determination relations exclude causal relations (Kim, 2005, 36ff). For example, the existence of a university at a time is what ultimately determines that Dr. Smith is a professor at the university at that time, not the fact that she got tenure two years ago. Similarly, subvening physical bases determine mental events, thereby excluding prior mental events as causes of those mental events.

There are four arguments in support of the causal exclusion principle. First, the massive coincidence argument: the occurrence of multiple sufficient causes of the same effect is a rare coincidence. Barns infrequently burn down by the simultaneous occurrence of a lightning strike and a dropped match. Yet mental causation is ubiquitous: agents perform acts based on reasons hundreds of times a day and there are billions of agents in the world. Thus, the view that behavioural effects have more than a single sufficient cause is a view that stipulates massive amounts of coincidence, which is implausible (Kim, 1998, 53). Second, the parsimony argument: according to the venerable principle of parsimony, one ought not multiply causes beyond necessity, where necessity is eclipsed once sufficient causation is established (Kim, 1989, 98). Third, the necessity argument: overdetermining causes are individually sufficient, so not individually necessary. Billy and Suzy both throw stones, simultaneously breaking the window. Billy’s throw is individually sufficient (his throw alone, without Suzy’s, would have broken the window), so Suzy’s throw is individually unnecessary. If behaviour is likewise overdetermined, neither cause is individually necessary. But physical causal completeness insists that some physical cause is necessary, and mental causation requires a mental cause (Moore, 2017). Fourth, the additivity argument: if causation involves production, and one sufficient cause packs all the punch required to produce the effect, then a second cause would push the effect too far, or be incapable of producing the effect at all, on account of the fact that the effect has already been fully produced (Carey, 2011, 253; Kim, 1998, 53).

To briefly summarize, each of the four principles constituting the causal exclusion problem is substantially motivated. But it is difficult to imagine how one can consistently endorse all four principles. How can one agree that behavioural effects can have no more causes than the single sufficient physical cause, while simultaneously arguing that behavioural effects nevertheless have distinct mental causes as well? The nonreductive physicalist endorses the first three principles, leaving the fourth principle as an objection to the nonreductive physicalist view. The next section canvasses a variety of resolutions to this causal exclusion problem.
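
The tension among the four principles can be displayed schematically. What follows is a simplified sketch in assumed notation, not a formulation drawn from the literature cited: e is a given behavioural effect, C(x, e) says that x is a cause of e, S(x, e) says that x is a sufficient cause of e occurring at the relevant time, and M and P pick out mental and physical events respectively. The sketch uses the stronger reading of the exclusion principle (no further cause, partial or sufficient) and sets aside genuine overdetermination.

```latex
% Assumed notation: e is a fixed behavioural effect; C = "is a cause of",
% S = "is a sufficient cause of (at the relevant time)"; M, P = mental, physical.
\begin{align*}
\text{(1) Mental causation:} \quad & \exists m \,\bigl( M(m) \wedge C(m, e) \bigr) \\
\text{(2) Completeness:} \quad & \exists p \,\bigl( P(p) \wedge S(p, e) \bigr) \\
\text{(3) Irreducibility:} \quad & \forall x \,\forall y \,\bigl( (M(x) \wedge P(y)) \rightarrow x \neq y \bigr) \\
\text{(4) Exclusion:} \quad & \forall x \,\forall y \,\bigl( (S(x, e) \wedge C(y, e)) \rightarrow x = y \bigr)
\end{align*}
% From (1) and (2): some mental m causes e and some physical p suffices for e.
% From (4), with x = p and y = m: p = m.
% From (3), with x = m and y = p: m \neq p. Contradiction.
```

On this rendering, any consistent position must give up or reinterpret at least one of (1) through (4), or else revise the background assumptions about events and causation on which the schema relies.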

3. Solutions to the Causal Exclusion Problem

Numerous responses to the causal exclusion problem have arisen. Some solutions modify the metaphysical foundations underlying the causal exclusion problem (3.a-3.b). Others propose models of nonreductive physicalism that secure mental causation (3.c-3.f). Still others reject one of the four principles constituting the causal exclusion problem (3.g-3.j).

a. The Metaphysics of Events and Properties

The causal exclusion problem is grounded in a set of metaphysical assumptions. Some critics undermine the causal exclusion problem by refuting or simply rejecting these metaphysical assumptions. One such strategy focuses on the nature of events and properties presumed by the causal exclusion problem. Kim frames the causal exclusion principle in terms of events: no single event can have more than a single sufficient cause. Likewise, the principle of physical causal completeness indicates that physical events have sufficient physical causes, while the principle of mental causation states that events sometimes have mental causes. Clearly, the nature of events is central to the causal exclusion problem.

Kim endorses the property exemplification model of events, according to which an event is the instantiation of a property by an object at a time (Kim, 1976). Sebastian’s stroll at noon, or Brutus’ stabbing at sunset, are prototypical events as are the brain’s neural process at dawn and Joe’s pain at dawn. While each event has one constitutive property, events can have other properties as well. Sebastian’s stroll at noon has the constitutive property of ‘being a stroll’, while it also has the properties of being long and winding. Events are identical if they have the same object, constitutive property, and time. Thus, Sebastian’s stroll at noon is identical with the man’s stroll at 12:00 PM, but is not identical with Grace’s stroll at noon, or with Sebastian’s sleep at noon, or with Sebastian’s stroll at sunset. These identity conditions on events entail that a mental event is only identical with a physical event if, among other things, the mental property of the event is identical with the physical property of the event (Kim, 2005, 42). This is called the single-instantiation thesis, and a number of authors agree with it (Whittle, 2007, 64; Gibb, 2004, 469). The implication is that one cannot yoke event identity with property dualism.

Cynthia MacDonald and Graham MacDonald modify the property exemplification model in a manner that opens up a solution to the causal exclusion problem (MacDonald and MacDonald, 2006). They note that the same event can be the instantiation of both a constitutive property and numerous other properties as well. Sebastian’s stroll is the same event as Sebastian’s walk at noon, and Sebastian’s moving at noon, and Sebastian’s exercising at noon. As they say, “there can be just one instance of distinct properties” (MacDonald and MacDonald, 2006, 562). This co-instantiation thesis, applied to the mental causation debate, suggests that the same event can be an instance of a physical property and a distinct mental property. In other words, one can yoke event identity with property dualism. This affords the following solution to the causal exclusion problem: mental causes are identical with sufficient physical causes. This means that there are not two sufficient causes, even though mental properties cannot be reduced to physical properties.

This solution not only rests upon the successful modification of Kim’s framework, but also faces a version of the quausal problem (Honderich, 1982, 63-64; Sosa, 1984, 277; Kim, 1984, 267). The quausal problem stipulates that events are caused by virtue of causally relevant properties. For example, while the heavy, green pear causes the scale to tip to one pound, it is in virtue of the pear’s heaviness that the scale tips to one pound, not in virtue of the pear’s greenness. Likewise, while the event causes behavioural effects, it is plausible that the event, by virtue of its constitutive physical properties, rather than its mental properties, causes behavioural events. MacDonald and MacDonald avoid the quausal problem by suggesting that events cause as ontological simples. That is, they assume that the event that is a physical instance is the event that is a mental instance, and this event does not cause in virtue of its being a physical instance or a mental instance, but in virtue of its being an event (MacDonald, 2007, 243). Some worry that this response would allow every property instanced as the event to be causally efficacious (Wyss, 2010, 174).

Donald Davidson goes a step further than MacDonald and MacDonald, rejecting the Kimian framework of events entirely (Davidson, 1980, 163ff). For Davidson, events are ontologically simple. Events are not instantiations of constitutive properties, nor do they have other properties as ontological constituents. Thus, the building’s falling at noon is not, essentially, a falling, so it can truly be re-described using non-equivalent language, such as ‘the event reported about on pg. 5 of the Times’, or ‘that fateful event’. Or, the neural process can be truly described using physical vocabulary or non-equivalent mental vocabulary, such as ‘Jennie’s desire for a peach’. This amounts to event identity yoked together with conceptual dualism: the mental event is the physical event, while the mental description is irreducible to the physical description (Davidson, 1980, 207ff). This model solves the causal exclusion problem as follows: mental causes are identical with sufficient physical causes, so there is no more than a single sufficient cause, while mental predicates are irreducible to physical predicates.

Critics typically level the quausal problem against this Davidsonian solution (Honderich, 1982). Again, the quausal problem suggests that events cause in virtue of causally relevant properties: while the fleecy, pink slippers provide warmth, it is in virtue of their fleeciness, not their pinkness, that they provide warmth. Likewise, while mental causation may be secured by the fact that the mental event is the efficacious physical event, mental quausation fails by virtue of the fact that events cause in virtue of their lawlike physical properties, not in virtue of their mental properties. Davidson responds by stating that events cause as events, no matter whether the events are described in physical vocabulary or mental vocabulary (Davidson, 1993). Most philosophers are unsatisfied with Davidson’s response, as they find it plausible that events cause effects through causally relevant properties (Kim, 1993b).

b. The Metaphysics of Causation

The causal exclusion problem is, prima facie, a problem pertaining to causation. The principle of mental causation implies that there are mental causes, while the principle of physical causal completeness implies that there are physical causes that are sufficient causes. At the same time, the principle of causal exclusion stipulates that no effect can have more than a single sufficient cause. Questions about the nature of causation are paramount, and numerous solutions to the causal exclusion problem advert to modifying the metaphysics of causation upon which the problem rests.

Jaegwon Kim crafts the causal exclusion problem from within a productive model of causation, according to which “a cause is something that produces, or generates, or brings about its effects, something from which the effects derive their existence or occurrence” (Kim, 2007, 235). This means that causes push, pull, strike, transfer momentum or energy, or in some other way produce their effects. The additivity argument for the causal exclusion principle is largely motivated by the productive model of causation. If the physical cause packs all the punch required to produce the effect, then a distinct mental cause would push the effect further or harder or be incapable of producing the effect at all on account of the fact that the effect has already been fully produced.

Numerous philosophers undermine the causal exclusion problem by attacking this productive model of causation. They argue, for example, that productive notions of causation do not appear in contemporary physics (Loewer, 2007). Emboldened by these considerations, numerous critics resolve the causal exclusion problem by operating within different models of causation. For example, the nomological model of causation stipulates that causes nomologically necessitate their effects. Fire is a cause of smoke since there exists a law such that ‘if fire occurs, then smoke occurs’. On this view, mental events are causes of behavioural effects if there exists a law such that ‘if the mental event occurs, then the behavioural event occurs’. This law can be established, as the physical event, which necessitates the behavioural effect, also necessitates the occurrence of the mental event. So, all things being equal, the presence of the mental event necessitates the occurrence of the behavioural effect (Fodor, 1989, 66). Thus, behavioural effects can have nomologically sufficient physical causes and nomologically sufficient mental causes.

Critics level several objections at this nomological causation solution. First, the nomological model may fail as an account of causation. Numerous events stand in nomological relations with other events without being causes of those events (Kim, 2007, 231): the gun’s sound nomologically necessitates the hole in the wall, the knife’s shadow nomologically necessitates the gash in the screen. Similarly, mental events may be like shadows that nomologically necessitate without causing behavioural effects. One can object to this shadow analogy on the grounds that mental causation stipulates that mental events are efficacious whereas shadows are not. A more appropriate analogy may be where fire causes Joe’s death, but fire necessitates the appearance of smoke, which is clearly also a nomologically sufficient cause of Joe’s death. However, given that the causal exclusion principle stipulates that there cannot be two sufficient causes of behavioural effects, those committed to the causal exclusion principle will not allow two nomologically sufficient causes of behavioural effects.

Some philosophers resolve the causal exclusion problem by appealing to a counterfactual model of causation. According to the counterfactual model, effects are counterfactually dependent on their causes. Fire is a cause of smoke since there is a counterfactual dependency such that ‘Had the fire not occurred, the smoke would not have occurred’ (Lewis, 1986, 166-167). Counterfactual dependency is established if the nearest possible world where fire does not occur is also a world where smoke does not occur. On this view, mental events are causes of behavioural effects if the nearest possible world where the mental event is absent is a world where the behavioural effect is absent as well. This counterfactual dependency is established by virtue of the fact that the nearest possible world where the mental event does not occur is also, given supervenience, a world where the subvening physical event does not occur, which indicates the behavioural effect does not occur either (Loewer, 2007; Kroedel, 2015).
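The counterfactual conditions just described can be written schematically. The notation here is an illustrative assumption and is not drawn from the cited authors: O(x) abbreviates "event x occurs", the box-arrow is the counterfactual conditional (read "if it were the case that ..., it would be the case that ..."), and m, p and b stand for the mental event, its subvening physical event and the behavioural effect.

```latex
% Assumed notation (not from the source): O(x) = "event x occurs";
% \boxright = counterfactual conditional. Requires amssymb for \Box.
\newcommand{\boxright}{\mathrel{\Box\!\!\rightarrow}}

% Counterfactual dependence of an effect e on a cause c:
\[
  e \text{ counterfactually depends on } c
  \quad\Longleftrightarrow\quad
  \neg O(c) \boxright \neg O(e)
\]

% The instance needed for mental causation:
\[
  \neg O(m) \boxright \neg O(b)
\]
```

On the possible-worlds reading given above, the second formula is true when the nearest world in which m is absent is also a world in which b is absent.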

This accounts for mental causation, but establishing the truth of physical causal completeness is more complex on the counterfactual account. Some demonstrate that the behavioural event has a physical cause by appealing to the truth of the counterfactual ‘Had the physical event not occurred, the behavioural effect would not have occurred’ (Loewer, 2002). While this establishes that behavioural effects have physical causes, physical causal completeness requires that behavioural effects have sufficient physical causes. Unfortunately, the counterfactual model lacks a clear criterion for sufficient causation. Here is one possibility: the physical event is a sufficient cause of the behavioural effect if the nearest possible world where the physical event occurs without the mental event is a world where the behavioural effect occurs. The difficulty with this possibility is that nonreductive physicalists typically say that the physical event metaphysically necessitates the mental event, so there are no worlds where only the physical event occurs (Loewer, 2002, 658; Bennett, 2003, 479; Kallestrup, 2006, 472), and it is therefore not established that the physical event is a sufficient cause. Perhaps this difficulty is avoided by saying that the physical event is a sufficient physical cause of the behavioural effect if the nearest possible worlds where the physical event occurs are also worlds where the behavioural effect occurs (Menzies, 2013, 63; Kroedel, 2015, 366).

The counterfactualist solution to the causal exclusion problem faces additional difficulties as well. Like the difficulty facing nomological accounts, some complain that counterfactual dependency is established among non-causal events: had the knife’s shadow not occurred, the wound would not have occurred, but the shadow is not a cause of the wound; had the gun’s sound not occurred, the hole in the wall would not have occurred, though the sound is not a cause of the hole (Kim, 2007, 234). Similarly, mental events may be like shadows that behaviour counterfactually depends upon, but that nevertheless make no causal contribution.

Some philosophers have recently deployed a difference-making model of causation to solve the causal exclusion problem (List and Menzies, 2009, 482). The difference-making model, which bears similarities with a blend of the nomological account and the counterfactual account, says that causes must ‘make a difference’ for their effects. The fire makes a difference for the smoke if ‘Had the fire occurred, the smoke would have occurred’ is true, and if ‘Had the fire not occurred, the smoke would not have occurred’ is true. Mental events cause behavioural effects, since ‘Had the mental event occurred, the behaviour would have occurred’ and ‘Had the mental event not occurred, the behaviour would not have occurred’ are both true. Physical events are sufficient causes of behavioural effects, because ‘Had the physical event occurred, the behaviour would have occurred’ is true. But ‘Had the physical event not occurred, the behaviour would not have occurred’ is false, because the mental event could have a different physical realizer, in which case the behaviour would still have occurred. So, the physical event is a sufficient cause, securing physical causal completeness; meanwhile, the behaviour has no more causes than the distinct mental cause, which satisfies the causal exclusion principle while sustaining distinct mental causes of behaviour.
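The two difference-making conditions, and the verdicts the model assigns to the mental and physical events, can be set side by side. The symbols are an illustrative assumption and do not come from the cited authors: O(x) abbreviates "event x occurs", the box-arrow is the counterfactual conditional, and m, p and b stand for the mental event, its physical realizer and the behaviour.

```latex
% Assumed notation (not from the source): O(x) = "event x occurs";
% \boxright = counterfactual conditional. Requires amssymb for \Box.
\newcommand{\boxright}{\mathrel{\Box\!\!\rightarrow}}

% c makes a difference to e iff both conditionals hold:
\[
  \bigl(O(c) \boxright O(e)\bigr)
  \;\wedge\;
  \bigl(\neg O(c) \boxright \neg O(e)\bigr)
\]

% Verdicts reported in the text:
\[
\begin{array}{lcc}
  & O(\cdot) \boxright O(b) & \neg O(\cdot) \boxright \neg O(b) \\
  \text{mental event } m   & \text{true} & \text{true}  \\
  \text{physical event } p & \text{true} & \text{false}
\end{array}
\]
```

Read this way, m satisfies both conditions and counts as a difference-making cause, while p satisfies only the first, which is the sense in which the model treats p as sufficient for b without being a difference-maker.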

Critics level several objections at this difference-making solution. First, since the physical event is not a cause of the behaviour, physical causal completeness may fail (Bermudez and Cahen, 2015, 53). While some take this issue to defeat the model, others take it to be a virtue of the model, since it turns the causal exclusion problem on its head by establishing that mental causes exclude physical causes of behavioural effects (Menzies, 2015, 39-40). It is also possible to avoid the failure of physical causal completeness by arguing that behavioural effects are realization-sensitive. That is, if the occurrence of a different physical realizer yields a different behavioural effect, then the nearest worlds where the physical event does not occur are worlds where the behavioural effect does not occur. This establishes that the physical event is a cause of the behavioural effect. But this solution now violates the causal exclusion principle, as it postulates more than a single sufficient cause of the behaviour.

Some philosophers appeal to the related interventionist model of causation to solve the causal exclusion problem (Woodward, 2003). On interventionism, fire is a cause of smoke if we intervene on fire, or, bring it about that fire does or does not happen, while holding all other variables constant, and the result is that the smoke does or does not happen. On this view, mental events are causes, since, when one intervenes on the mental event, the behavioural event shifts. For example, changing a mental event from a belief that carrots are healthy to a belief that carrots are poisonous would eliminate carrot eating behaviour. Likewise, the physical event is a cause of the behaviour, since taking the physical event away would take the behavioural effect away. The result: the interventionist model articulates a manner in which behavioural effects legitimately have more than a single sufficient cause, thereby falsifying the principle of causal exclusion.

Some resist this interventionist solution to the causal exclusion problem on the grounds that it is impossible to establish mental causation on interventionism. Interventionist mental causation requires intervening on the mental event while holding all other events, including the subvening physical event, fixed. But, by virtue of supervenience, it is impossible to intervene on the mental event without also altering the subvening physical event, so interventionist mental causation is not established (Baumgartner, 2010). Some respond by stipulating that the special nature of the supervenience relation entails that one need not – since one cannot – hold the physical event fixed while intervening on the mental event. As such, there is no difficulty establishing mental causation within an interventionist framework (Woodward, 2015).

c. Supervenience

The most longstanding model of nonreductive physicalist mental causation is mind-body supervenience. According to this view, mental events supervene upon, or are determined by, physical events. This notion of supervenience implies a tight enough dependency relation for mental events to inherit the efficacy of their subvening bases. For example, Jennie’s desire for a peach causes her to eat a peach, while the subvening physical cause also causes her to eat the peach (Zangwill, 1996). It is worth noting that Kim once appealed to the supervenience relation to secure mental causation (Kim, 1993, 106). It is common to depict the situation with the aid of the following diagram, where m stands for a mental cause, p is a physical cause, m* is a mental effect, p* is a physical effect, horizontal and diagonal lines indicate causation, and vertical lines indicate supervenience:

Diagram 1: Supervenient Mental Causation

According to this diagram, p is a sufficient cause of p*, but p necessitates the presence of m, ensuring that m is present to cause p* and m* as well.

The causal exclusion principle is fashioned to directly confront this supervenience model of mental causation. Indeed, sometimes the causal exclusion problem is called the supervenience argument, indicating that supervenience-based solutions are clear targets of the causal exclusion principle. The causal exclusion principle stipulates that events cannot have more than a single sufficient cause, yet p* has two causal arrows converging on it, so one of the causes must be excluded. The principle of physical causal completeness stipulates that p must be a sufficient physical cause of p*, so m is excluded as a cause. Furthermore, m* is fully determined by its supervenience base p*, so m is excluded as a cause of m* (Kim, 2005, 39ff). There is no work for the mental event to do, which undermines the principle of mental causation, thereby calling supervenience-based nonreductive physicalism into question.

d. Emergentism

The difficulties associated with the supervenience solution led many philosophers to explore fresh methods of accommodating a sufficient physical cause and a distinct mental cause of the same effect. Chief among these new methods is renewed interest in the doctrine of emergentism (O’Connor and Wong, 2005; Humphreys, 1997). Emergentists argue that novel properties emerge, or arise, out of a base level. At the base level, microphysical particles arrange in such a way as to compose, and give rise to, higher level mereological wholes such as molecules or molecular compounds. The molecule has novel properties, that is, properties that its particles, in isolation, lack. Molecules then arrange in such a way as to compose higher level biological wholes such as cells, hearts, reproductive mechanisms, and organisms. These biological wholes have novel properties as well, such as the ability to pump an organism’s blood, to reproduce, or to chew. Similarly, some biological wholes, specifically brains, give rise to persons with novel mental properties such as beliefs and desires. And persons, when appropriately grouped together, compose social structures and institutions with novel properties that the persons, in isolation, lack: for example, being a university professor, or being the baseball team’s shortstop.

Higher level emergent properties are capable of downward causation, which means that emergent properties influence lower level domains. Thus, the puppy-wise organization of bones and muscles influences these bones and muscles: if the bone were not arranged as a puppy jaw, the bone would not be able to bite down or move. Emergent properties secure mental causation by exercising downward efficacy on lower-level behaviours. The emergentist conceives of mental properties as emergent properties that arise out of, and are not reducible to, their neural parts. Hence, the principle of irreducibility is secured. At the same time, emergent mental properties arise out of, and are dependent upon, lower level physical properties, which may establish the physicalist view that all events are dependent upon physical events.

It is common to object to the emergentist solution to the mental causation problem on the grounds that emergent properties supervene upon their bases. This is problematic because, as discussed above, supervenient properties are susceptible to exclusion pressures. In other words, since the lower level bases are sufficient causes of behavioural effects, supervening emergent properties are excluded from causing behavioural effects (Kim, 1999; McLaughlin, 1997, 16).

Some emergentists respond by rejecting the view that emergent properties are supervenient properties, hence rejecting the view that the same exclusion pressures facing supervenient properties face emergent properties (Silberstein and McGeever, 1999). Other emergentists concede that emergent properties lack downward causation. They endorse what is sometimes called weak emergentism, or epistemic emergentism, according to which higher level descriptions cannot be explained by, and are not predictable from, lower level descriptions of the same phenomenon (Bedau, 1997). This view bears certain affinities with conceptual dualism yoked with ontological monism. Other emergentists respond by abandoning, or significantly nuancing, the principle of physical causal completeness (Hendry, 2010). Because emergent properties have novel, downward causal influence on behaviour, the lower level parts must not be sufficient causes of behaviour. However, higher level wholes, such as persons, rabbits, and mountains, are still broadly physical (Kim, 1997, 293), so higher level causes are still broadly physical causes. Thus, behavioural effects have sufficient physical causes, where sufficient physical causes include both the lower level microphysical processes and the higher level broadly physical processes. This position, while it may secure a principle of broad physical causal completeness, seems to abandon microphysical causal completeness.

e. Functionalism

Functionalism is a dominant model of nonreductive physicalist mental causation that construes mental properties as functional properties of the mind (Witmer, 2003; Block, 1990). Jennie’s belief that it will rain is defined as her being in whatever state is caused by rain-indicating perceptual inputs and, given a background psychology that is familiar with rain, causes her rain preparation behaviour. These functional properties in turn have various physical realizers which carry out the specific task defined by the causal role. Jennie’s belief that it will rain is realized by some neural structure in her brain. Functionalism secures irreducibility and mental causation by defining mental events by their causal profiles, which are distinct from their physical realizers which implement the causal profile and cause behaviour.

The causal exclusion problem poses difficulties for this functionalist model of mental causation. Since, according to physical causal completeness, the physical realizer does all the work in bringing about behavioural effects, the mental state as defined by an abstract causal role is excluded from efficacy (Block, 1990, 155; Kim, 1998, 51). For example, the pill’s functional property of ‘being dormitive’ is realized by some chemical property of the pill, where the chemical does all the work in producing sleep in patients, leaving the dormitivity of the pill with no work left to do.

Functionalists respond to the causal exclusion problem in several ways. Some functionalists, including Kim, endorse realizer functionalism, which is the view that functional states are identical with their efficacious realizers, thereby inheriting the causal efficacy of their physical realizers. Thus, Jennie’s belief that it will rain is identical to the specific neural structure in her brain that realizes the functional role specified as the belief. This view secures mental causation via ontological reductionism, while it typically also endorses property dualism. For example, while the dormitive property of one pill is secobarbital, the dormitive property of another pill is phenobarbital, so not all dormitivity is realized by secobarbital. This functional reductionist position will be discussed in Section 3.i. Other functionalists endorse role functionalism, which is the view that functional states are distinct from their efficacious realizers. These functionalists must explain how functional properties play a role over and above the role played by their realizers. One currently viable possibility is the realization approach, as detailed in Section 3.f.

f. Realization

The realization strategy agrees with the functionalist view that mental properties are realized by physical properties. Its proponents add, however, that the causal powers associated with the realized mental property are distinct from the causal powers associated with the realizing physical property. Typically, the causal powers associated with the mental property are taken to be a proper subset of the causal powers associated with the physical property (Shoemaker, 2011; Wilson, 2011). For example, the causal powers associated with pain include the disposition to produce winces and groans, while the causal powers associated with pain’s realizer, C-Fibre firing, include the disposition to produce winces and groans as well as other capabilities, such as the disposition to slightly tip sensitive scales and the disposition to nourish hungry lions. This model preserves irreducibility, since the causal powers associated with the mental property are more limited than, and hence distinct from, the causal powers of the physical property. This model secures mental causation by noting that it is the causal powers associated with the mental property that are causally efficacious in bringing about the behavioural effect. This model secures physical causal completeness since the mental property is realized by, hence is nothing over and above, the physical property instance that causes the effect. And this solution does not violate the causal exclusion principle, as parts do not compete with their wholes for causal efficacy: a salvo of shots fired at Smith does not exclude the single shot in that salvo that strikes Smith as a cause of Smith’s death (Shoemaker, 2007, 64).

The realization strategy faces several difficulties. Of central concern is whether tightly related but distinct tokens compete for causal efficacy. The suggestion is that the mereological relation is so tight that exclusion pressures do not arise. This is similar to the view that the supervenience relation is so tight that exclusion pressures do not arise. In Section 3.j, the compatibilist will similarly argue that the exceeding tightness of the relation dodges exclusion pressures. As discussed in the case of supervenience, however, the advocate of the causal exclusion principle will not be convinced that exclusion pressures are avoided in these cases. A second worry with the realization strategy is whether tokens realize both that subset of causal powers that is the mental property and the complete set of causal powers that is the physical property. If the causal powers associated with pain include the disposition to produce winces, and the causal powers associated with C-Fibre firing include many things including the disposition to produce winces, and both of these causal powers are realized in the same instance, there seems to be a double counting of causal powers (Audi, 2012, 661). This double counting problem seems to re-introduce worries about overdetermination. It is possible to avoid this difficulty by positing token identity with property irreducibility (Wilson, 2011). That is, while mental properties are distinct from physical properties by virtue of their distinct causal profiles, realized mental properties are identical with their realizing physical instances. This move, however, like realizer functionalism, salvages mental causation at the price of abandoning token irreducibility. A third worry is that Shoemaker’s solution takes mental properties to be parts of physical causes of behavioural effects. Since wholes depend on their parts, physical causes of behavioural effects would be dependent upon mental properties, which may not be consistent with physicalism (Pineda and Vicente, 2017).

g. Epiphenomenalism and Autonomy

The epiphenomenalist resolves the causal exclusion problem by abandoning the principle of mental causation while endorsing the principles of physical causal completeness, irreducibility, and causal exclusion. Thus, effects can have only their sufficient physical causes, leading to the view that distinct mental events are not causally efficacious (Robinson, 2006; Gadenne, 2006). A common argument in support of epiphenomenalism is the joint strength of the principles of physical causal completeness, irreducibility, and causal exclusion, which together imply that mental causation fails.

The challenge for the epiphenomenalist is to overcome the argumentation supporting the principle of mental causation. They typically do this in two ways. First, they argue that, due to supervenience, the physical cause of behavioural effects necessitates the presence of a mental event as well. Thus, while mental events are not causal, they necessarily precede behavioural effects (Robinson, 2004, 165). Jennie’s eating of the peach is preceded by her desire for a peach, even though her desire is not a cause of her eating. This is a non-causal account of the common sense fact that the appropriate mental event precedes behavioural effects. Critics typically point out that epiphenomenalists do not think the physical event metaphysically necessitates the mental event. This makes it possible for the physical cause of Jennie’s peach eating to have given rise to a desire for broccoli instead, thereby breaking the link between the mental event and the appropriate behavioural response (Pauen, 2006).

Some endorse a weakened version of epiphenomenalism called the autonomy solution. While epiphenomenalism states that mental events lack causal efficacy tout court, the autonomy solution states that mental events do not causally interact with physical events, but do causally interact with mental effects (Gibbons, 2006). Jennie’s desire for a peach does not cause her to eat peaches but does cause her to believe she desires a peach. This move secures physical causal completeness and irreducibility while simultaneously establishing that behavioural effects are not overdetermined, and that mental events can cause mental effects. Critics argue that this solution leads to the unfortunate result that Jennie’s pain causes her to believe she is about to scream but does not cause her to actually scream (Dennett, 1991, 403). This drawback may be avoided by endorsing a wider autonomist solution, according to which mental events cause both mental effects and behavioural effects, but do not cause microphysical effects (Zhong, 2014, 349-350). This solution, however, like all autonomist solutions, faces worries that subvening microphysical processes determine behavioural effects and mental effects, thereby excluding mental events from causing mental effects or behavioural effects (Kim, 2005, 36-37).

h. Interactionist Dualism

The interactionist dualist solves the causal exclusion problem by accepting the principles of mental causation, irreducibility and causal exclusion, which jointly lead to the falsity of the principle of physical causal completeness. Because distinct mental events cause behavioural effects and behavioural effects are not overdetermined, the physical cause of the behaviour is not a sufficient cause (White, 2017; Meixner, 2008). Again, the plausibility of the three endorsed principles provides support for the conclusion that physical causal completeness is false.

This model must overcome the arguments in support of physical causal completeness. Some do so by providing models according to which mental causation is ‘invisible.’ This means that while mental events are causally efficacious, the efficacy of mental events is not detectable at the physical level (Lowe, 2008, 74; Gibb, 2015). One will not find gaps in the physical causal process that mental events must fill in, so behavioural effects have sufficient physical causes, so physical causal completeness is true. While this solution may not violate physical causal completeness, it does violate physical causal closure, as physical effects do not have only physical causes. This solution also faces exclusion pressures—the behavioural effect has a sufficient physical cause, thereby excluding purported distinct mental causes.

i. Reductionism

The reductionist solves the causal exclusion problem by rejecting the principle of irreducibility, leaving them open to embrace the principles of mental causation, physical causal completeness, and causal exclusion (Kim, 2005, 101). Thus, mental causes are identical with sufficient causes of behavioural effects, thereby establishing that behaviour has no more than a single sufficient cause. It is common to argue in support of reductionism by appeal to the joint strength of the other three principles: because behaviour cannot have more causes than the sufficient physical cause, the only way for mental causation to be true is to identify mental events with physical events.

The challenge is to demonstrate how the identity can be sustained, given the distinctions between mental and physical events, and the multiple realizability of the mental. Reductionists typically argue that the appearance of distinction is explained by the fact that the same event can be known by direct qualitative experience and by third-person description. Kim avoids the multiple realizability issue by emphasizing event reductionism: Jennie’s hunger is identical to an increase in a specific type of ghrelin in her gastrointestinal tract, while the shark’s hunger is identical with some other physiological state. While some supplement event reductionism with property dualism, this move is unavailable to Kim since event identity implies property identity on his model of events. Kim concludes that the property of hunger exists only as a functional concept (Kim, 2010, 207ff). The increase of a specific type of ghrelin in Jennie’s gastrointestinal tract is actually an instance of the property of being an increase of a specific type of ghrelin, and this event can be truly described using the functional concept of hunger as well. Critics worry that this view amounts to ontological monism yoked with conceptual dualism, which is troublesome because Kim has previously argued against the Davidsonian model of event identity with conceptual dualism (Moore and Campbell, 2015). It also seems that events must cause in virtue of their physical properties, not in virtue of their mental properties, since mental properties do not exist, so mental quausation fails.

j. Compatibilism

Compatibilism, as coined by Terence Horgan (1997, 166), is the view that endorses the principles of mental causation, irreducibility, and physical causal completeness, thereby disputing the causal exclusion principle in some manner (Bennett, 2003; Shoemaker, 2007). Thus, there is some benign way of showing how behavioural effects can have sufficient physical causes and distinct mental causes. Support for compatibilism arises from the plausibility of the three endorsed principles, which constitutes evidence that the causal exclusion principle is false. Compatibilists argue that there obtains an exceedingly tight relation between the physical cause and mental cause. As discussed in previous sections, the exceedingly tight relation may be a relation of strong supervenience, or realization, or some other such relation in which the physical cause metaphysically necessitates the mental cause. As it is impossible for blue to occur without a colour occurring, and it is impossible for a horse-wise arrangement of horse parts to occur without a horse occurring, so it is impossible for the physical cause to occur without the mental cause.

This exceedingly tight relation is deployed as a resolution to the arguments supporting the causal exclusion principle. The massive coincidence involved with two independent causal processes converging on the same effect is replaced with the requirement that physical causes and their dependent mental causes must converge on behavioural effects (Loewer, 2002, 658). This tight relation implies that the physical cause necessitates the mental cause, so mental events are necessary for behavioural effects, overcoming the necessity argument. Likewise, the mental event guarantees the presence of some physical cause, so physical causes are necessary for behavioural effects (Kallestrup 2006, 472; Arnadottir and Crane 2013, 255), overcoming the necessity argument as well. Similarly, the parsimony argument stipulates one should not countenance more causes than necessary, but both the physical cause and the mental cause are necessary.

The compatibilist view faces a number of difficulties. Some worry that the compatibilist solution is ad hoc as the only instances of dependent overdetermination in nature appear to be those very instances of mental and physical dependent overdetermination that compatibilists suggest (Pineda, 2002). Compatibilists reply that ubiquitous, naturally occurring part-whole relations are also dependently overdetermined (Arnadottir and Crane, 2013, 258). The boxer’s knuckles and fist both strike the punching bag, simultaneously causing the punching bag to move; the baseball and the baseball’s parts both cause the window to shatter. Secondly, while compatibilists argue that exclusion pressures dissipate once the dependency relation between physical events and mental events is established, critics argue that the exclusion principle precisely applies to only those situations in which there are two dependent causes of the same effect (Kim, 2005, 48). Moreover, while compatibilism establishes that mental events are necessarily present prior to behavioural effects, it is not clear that the mental event is a cause of the behavioural effect. It is possible, for example, that the mental event is like a necessarily present epiphenomenal shadow that does no causal work. This leaves the sufficient physical cause as the single sufficient cause of the behavioural effect. Compatibilists reply by stating that the mental event is not akin to an epiphenomenal shadow, but rather is a cause of the effect. However, the more the compatibilist insists that the behavioural effect necessarily has a mental cause, the more difficult it is to show that the physical cause is an individually sufficient cause of the effect (Moore, 2017, 36).

4. Explanatory Exclusion

The causal exclusion principle has a “companion principle” (Kim, 2005, 17) in the realm of explanation called the principle of explanatory exclusion. The principle of explanatory exclusion states: “There can be no more than a single complete and independent explanation for any one event” (Kim, 1988, 233). In fact, it is of historical worth to note that Kim’s inaugural articulation of the exclusion problem occurs in the context of excluding superfluous explanations (Kim, 1988; Kim, 1989). It is important to note that the principle of explanatory exclusion allows for dependent explanations in excess of the complete explanation of the same event, while causal exclusion specifically bans dependent causes in excess of the sufficient cause of the same event. The viability of this dissimilarity is questionable: if distinct but dependent explanations of the same event are permissible, why are distinct but dependent causes of the same event not permissible? Or, if distinct but dependent causes of the same event are not permissible, why are distinct but dependent explanations of the same event permitted?

Like the causal exclusion principle, the explanatory exclusion principle is supported by a parsimony argument (Kim, 1989, 98): explanations should not be multiplied beyond necessity, where one complete explanation of an event is necessary, so additional independent explanations of the same event can be excluded. The explanatory exclusion principle is also supported by appeal to explanatory realism, which says that explanations track objective relations and that these objective relations are the content of explanations (Kim, 1988, 226). Because there can be no more than a single sufficient cause of events, and explanations track objective relations, there can be no more than a single complete and independent explanation of events.

The explanatory exclusion principle poses problems for models claiming that the same event has a complete physical explanation and an independent mental explanation, which includes the popular model of ontological monism yoked with conceptual dualism (Davidson, 1993; Papineau, 2002). Jennie’s increased heart rate has a complete physical explanation in terms of a release of hormones from her amygdala, but this same event is also explained by her fear of the approaching bear. Since the physical explanation is a complete explanation, the intensionally independent mental explanation can be excluded as unnecessary—an unpalatable result.

Numerous responses to this explanatory exclusion problem have been proposed. First, some reduce mental explanations to physical explanations by endorsing an extensionalist model of explanatory individuation (Kim, 1988, 233). The physical explanation of Jennie’s increased heart rate refers to the same causal relation as the mental explanation of Jennie’s increased heart rate. And the causal relation is the content of both explanations; therefore, the explanations state the same thing, so there is really only one explanation. Explanatory exclusion pressures only arise when there are two explanations of the same event, so explanatory exclusion pressures do not arise. This response is accused of endorsing a counterintuitive model of explanatory individuation, whereby two clearly distinct explanations are considered the same explanation. For example, ‘The earthquake caused the collapse of the building’ does not seem to state the same explanation as ‘The event that caused the collapse caused the collapse of the building’, since one is explanatory, and the other is not (Marras, 1998).

Nonreductive physicalists typically solve the explanatory exclusion problem in one of two ways. First, as discussed, the explanatory exclusion principle allows for two explanations of the same event, so long as the explanations are not independent. Thus, if the mental explanation is dependent upon the distinct complete physical explanation, then explanatory exclusion pressure need not arise. Plausibly, mental explanations are dependent upon physical explanations by virtue of the fact that the mental ontologically supervenes upon the physical (Melnyk, 1996). Thus, ‘Jennie’s fear explains her increased heart rate’ is a distinct but ontologically dependent explanation of the same causal relation that is explained by ‘hormone release from her amygdala explains her increased heart rate’. This solution bears certain similarities to the compatibilist solution to the causal exclusion problem, which itself posits distinct but dependent mental causes of the same behavioural effects. Likewise, it is open to the charge that distinct but dependent mental explanations can be excluded on account of the fact that behavioural effects have a complete physical explanation.

Second, rather than attempting to resolve the explanatory exclusion problem, many nonreductive physicalists dismiss the explanatory exclusion principle as a needlessly stringent constraint on explanation. They argue that there is no difficulty with describing the same event in multiple ways (Arnadottir and Crane, 2013, 256). The red rose can be re-described as the red-or-green rose, which can be re-described as the red flower, which can be re-described as the apple coloured rose, etc. Similarly, the same causal relation between events can be described in microphysical terms, neuroscientific terms, or psychological terms, so mental explanations need not be excluded. This solution relies upon acceptance of the contestable view that the value of parsimony does not apply to explanation. Moreover, even if it is true that behavioural events can have physical and mental explanations, it is worrisome that behavioural events need not have mental explanations, given that they already have complete physical explanations.

5. The Generalization Problem

To this point, the causal exclusion problem has been restricted to the domain of mental causation, so only mental events have been in danger of being excluded from causal efficacy. Numerous philosophers worry, however, that the causal exclusion problem might generalize. That is, if mental causes are excluded by the sufficiency of subvening neural causes of behavioural effects, then perhaps neural causes are excluded by the sufficiency of the subvening chemical causes of behavioural effects. These chemical causes are, in turn, excluded by the sufficiency of the subvening microphysical causes of behavioural effects (Kim, 1997; Burge, 1993, 102). This generalization problem leads to the following two problems. First, not only are mental causes threatened by the causal exclusion problem, but the causal efficacy of all special science properties is now threatened. Second, if there is no bottom level to physics, then all causal efficacy may drain away, since microphysical causation would be excluded by lower level quantum processes, which would in turn be excluded by lower level processes, and so on and so forth (Block, 2003; Walter, 2008). These problems are so severe that they are sometimes treated as reductio ad absurdum arguments against the causal exclusion problem.

Kim’s initial response to the generalization problem is that the causal exclusion problem does not generalize, since the exclusion-engendering relation holding between mental and physical events is dissimilar to the relations holding between special science entities. The relation between the mental and the physical is a relation between higher order properties and lower order properties, where both of these properties are instantiated by the same substance. In this case, exclusion pressures arise, as the lower order properties of the substance do all the work, excluding the efficacy of other properties of the substance. By contrast, the relation between special science properties and their bases is a relation between higher level structural properties of a whole and properties of lower level parts, where these properties are instantiated by different substances. Properties instantiated in different substances need not causally compete: the 10kg weight of the table causes the scale to tip to 10kg, and the table’s weight is not excluded by the 6kg weight of the top and the 4kg weight of the pedestal, since neither of those parts can make the scale tip to 10kg.

Numerous critics reject this response to the generalization problem (Block 2003; Noordhof, 1999). Of central concern is that higher level structural properties are supervenient upon the properties and relations of the lower level parts taken together. The combined properties and relations of the lower level parts are a sufficient cause of whichever effect occurs, thereby excluding supervening higher level structural properties of wholes. The 10kg weight of the table is excluded from causing the scale to tip to 10kg by the combined weight of the pedestal and top. After all, the scale does not tip to 20kg when the pedestal and top, and the table as well, sit upon it.

There are two other replies to the generalization problem that are worth discussing. First, the reductionist response claims that the structural property of the higher level whole is identical to the properties and relations of the lower level parts. Since the higher level structural property is identical with the causally efficacious lower level state of affairs, the higher level structural property is causally efficacious (Kim, 2005, 69). And, because the identity between higher level structural properties and lower level states of affairs holds all the way down the mereological scale, there is no fear of causal powers draining away to a bottomless level (Kim, 2005, 68).

There are several considerations weighing against this reductive solution to the generalization problem. First, the higher level structural property is singular while the lower level properties of, and relations among, the parts are a plurality. A water molecule, for example, is singular. The hydrogen atom, the oxygen atom, the other hydrogen atom, and the binary bonding relations holding between individual atoms are a plurality. It is difficult to see how a singularity can be identical with a plurality (Moore, 2010). Second, higher level wholes are multiply composable (Block, 2003, 145). For example, the same bicycle can have a Mavik tire or a Michelin tire functioning as its front wheel. If, however, the bicycle is identical to its lower level parts and relations, which includes the Mavik tire, and then the Mavik tire is replaced by a Michelin tire, then the bicycle is not the same bicycle after this alteration. This discussion not only intersects with debates in the philosophy of science, but also interacts with longstanding debates in mereology, such as the questions of mereological essentialism and whether composition is identity.

It is also possible to adopt a nonreductivist response to the generalization problem, according to which higher level structures are distinct from their lower level parts and relations. This response bears an affinity with the emergentist response to the causal exclusion problem. The task of the nonreductivist is to demonstrate how higher level structures have causal powers above and beyond the causal powers of the parts and their relations. To this end, it is uncontroversial that structure is efficacious. For example, fructose and glucose are isomers, both with the molecular formula C6H12O6. The constituent elements are the same, and they have the same properties. Fructose and glucose, however, are structured differently, and so they have different properties. Fructose is sweeter than glucose, and causes less insulin secretion in humans than glucose, for example. So, plausibly, higher level structure provides novel efficacy. The question is whether the lower level parts of fructose, with their properties, in their specific relations, are sufficient causes for these effects. If they are not, then nonreduced higher level structure has novel causal powers, but the completeness of the lower physical level is questioned. If they are, then the completeness of the lower physical level is established, but the efficacy of the higher level structure may be excluded.

6. Conclusion

While no solution to the causal exclusion problem has enjoyed widespread acclaim, there are several flourishing avenues of response. Chief among them are the compatibilist response and appeals to differing models of causation, though discussion in other areas is ongoing as well. The causal exclusion problem, however, has a manner of re-establishing itself after it seems to have been solved. So, it is unlikely that the problem will dissipate in the short term. It is clear, however, that the overriding aspiration of philosophers is to find a nonreductive, physicalist solution to the causal exclusion problem, as few follow Kim’s reductive conclusions.

7. References and Further Reading

  • Arnadottir, S., and Crane, T. (2013). “There is No Exclusion Problem”. In Mental Causation and Ontology, edited by S. Gibb, and R. Ingthorsson, p. 248–265. Oxford: Oxford University Press.
  • Audi, P. (2012). “Properties, Powers, and the Subset Account of Realization”. Philosophy and Phenomenological Research, 84, 3, p. 654-674.
  • Bedau, M. (1997). “Weak Emergence”. Philosophical Perspectives, 11, p. 375-399.
  • Baumgartner, M. (2010). “Interventionism and Epiphenomenalism”. Canadian Journal of Philosophy, 40, 3, p. 359-383.
  • Bennett, K. (2003). “Why the Exclusion Problem Seems Intractable, and How, Just Maybe, to Tract it”. Noûs, 37, p. 471-49.
  • Bennett, K. (2008). “Exclusion Again”. Being Reduced.  Hohwy J. and Kallestrup J. (Eds.) Oxford: Oxford University Press, p. 280-305.
  • Bermudez, J. & Arnon, J. (2015). “Mental Causation and Exclusion”. Humana Mente, 29, p. 47-68.
  • Block, N. (1990). “Can the Mind Change the World?” Meaning and Method: Essays in Honor of Hilary Putnam. Cambridge: Cambridge University Press.
  • Block, Ned (2003). “Do Causal Powers Drain Away?” Philosophy and Phenomenological Research, 67, p. 133-150.
  • Brewer, B. (1995). “Mental Causation: Compulsion by Reason”. Aristotelian Society Supplementary, 69, p. 237-253.
  • Burge, T. (1993). “Mind-Body Causation and Explanatory Practice”. Mental Causation. Heil, John and Mele, Alfred (eds.). Oxford: Clarendon Press, p. 97-120.
  • Carey, B. (2011). “Overdetermination and the Exclusion Problem”.  Australasian Journal of Philosophy, 89, 2, p. 251-262.
  • Chalmers, D. (1996). The Conscious Mind. New York, Oxford University Press.
  • Crane, T. and Mellor, D (1990). “There is No Question of Physicalism”. Mind 99, p. 185-206.
  • Davidson, D. (1963). “Actions, Reasons and Causes”. Journal of Philosophy, 60, p. 685-700.
  • Davidson, D. (1980). Essays on Actions and Events. Clarendon Press: Oxford.
  • Davidson, D. (1993). “Thinking Causes”. Mental Causation. Heil, John and Mele, Alfred (Eds.) Oxford: Clarendon Press, p. 3-18.
  • Dennett, D. (1991). Consciousness Explained. Penguin Press.
  • Fodor, J. (1989). “Making Mind Matter More”. Philosophical Topics, 17, 1, p. 59-79.
  • Fodor, J. (1974). “Special Sciences, or the Disunity of Science as a Working Hypothesis”. Synthese, 28, p. 77-115.
  • Gadenne, V. (2006). “In Defense of Qualia Epiphenomenalism”. Journal of Consciousness Studies, 13, 1-2, p. 101-114.
  • Gibb, S. (2004). “The Problem of Mental Causation and the Nature of Properties”. Australasian Journal of Philosophy, 82, p. 464-475.
  • Gibb, S. (2009). “Explanatory Exclusion and Causal Exclusion”. Erkenntnis, 71, p. 205-221.
  • Gibb, S. (2015). “Defending Dualism”. Proceedings of the Aristotelian Society, 115, 2, p. 131-146.
  • Gibbons, J. (2006). “Mental Causation Without Downward Causation”. Philosophical Review, 115, p. 79-103.
  • Harbecke, J. (2008). Mental Causation: Investigating the Mind’s Powers in a Natural World, Frankfurt: Ontos Verlag.
  • Hendry, R. (2010). “Emergence vs. Reduction in Chemistry”.  Emergence in Mind, MacDonald, C. and MacDonald, G. (eds.), Oxford: Oxford University Press, p. 205-221.
  • Honderich, T (1982), “The Argument for Anomalous Monism”. Analysis, 42, p. 59-64.
  • Horgan, T. (1997). “Kim on Mental Causation and Causal Exclusion”. Nous Supplement: Philosophical Perspectives, 11, p. 165-184.
  • Horgan, T. (2001). “Causal Compatibilism and the Exclusion Problem”. Theoria, 16, p. 95-116.
  • Humphreys, P. (1997). “How Properties Emerge”. Philosophy of Science, 64, p. 1–17.
  • Jackson, F. (1982). “Epiphenomenal Qualia”. The Philosophical Quarterly, 32, 127, p. 127-136.
  • Johansen, M. (2014). “Causal Contribution and Causal Exclusion”. Philosophers’ Imprint, 14, 33, p. 2-16.
  • Kallestrup, J. (2006). “The Causal Exclusion Argument”. Philosophical Studies, 131, p. 459-485.
  • Kim, J. (1976). “Events as Property Exemplifications”. Supervenience and Mind, Cambridge: Cambridge University Press, p. 33-52.
  • Kim, J. (1988). “Explanatory Realism, Causal Realism, and Explanatory Exclusion”. Midwest Studies in Philosophy, 12, p. 225-239.
  • Kim, J. (1989). “Mechanism, Purpose, and Explanatory Exclusion”. Nous-Supplement: Philosophical Perspectives, 3, p. 77-108.
  • Kim, J. (1993). Supervenience and Mind. Cambridge: Cambridge University Press.
  • Kim, J. (1993b). “Can Supervenience and ‘Non-Strict Laws’ Save Anomalous Monism”. Mental Causation. Heil, John and Mele, Alfred (Eds.). Oxford: Clarendon Press, p. 18-26.
  • Kim, J. (1997). “Does the Problem of Mental Causation Generalize?” Proceedings of the Aristotelian Society, 87, p. 281-297.
  • Kim, J. (1998). Mind in a Physical World. Cambridge: MIT Press.
  • Kim, J. (1999). “Making Sense of Emergence”. Philosophical Studies 95, p. 3-36.
  • Kim, J. (2005). Physicalism, or Something Near Enough.  Princeton: Princeton University Press.
  • Kim, J. (2007). “Causation and Mental Causation”. Contemporary Debates in Philosophy of Mind. McLaughlin, B. and Cohen, J. (eds). Victoria: Blackwell.
  • Kim J. (2009). “Mental Causation”. Oxford Handbook of Philosophy of Mind. McLaughlin, B. and Beckermann, A. and Walter, S. (Eds.). Oxford: Oxford University Press, p. 29-52.
  • Kim, J. (2010). Essays in the Metaphysics of Mind. Oxford: Oxford University Press.
  • Kroedel, T. (2015). “Dualist Mental Causation and the Exclusion Problem”. Noûs 49 (2): 357–375.
  • Lewis, D. (1986). Philosophical Papers: Volume II. Oxford: Oxford University Press.
  • List, C. & Menzies, P. (2009). “Nonreductive Physicalism and the Limits of the Exclusion Principle”. Journal of Philosophy, 106, 9, p. 475-502.
  • Loewer, B. (2002). “Comments on Jaegwon Kim’s Mind and the Physical World”. Philosophy and Phenomenological Research, 65, p. 655-662.
  • Loewer, B. (2007). “Mental Causation, or Something Near Enough”. Contemporary Debates in Philosophy of Mind. McLaughlin, B. and Cohen, J. (Eds). Malden: Blackwell Publishing, p. 243-264.
  • Lowe, E. (2000). “Causal Closure Principles and Emergentism”. Philosophy, 75, p. 571-586.
  • Lowe, E. (2008). Personal Agency. Oxford: Oxford University Press.
  • MacDonald, C. and MacDonald, G. (2006). “The Metaphysics of Mental Causation”. The Journal of Philosophy, 103, p. 539-576.
  • MacDonald, G. (2007). “Emergence and Causal Powers”.  Erkenntnis, 67, p. 239-253.
  • Malcolm, N. (1968). “The Conceivability of Mechanism”. Philosophical Review. 77, p. 45-72.
  • Marcus, E. (2005). “Mental Causation in a Physical World”. Philosophical Studies, 122, p. 27-50.
  • Marras, A. (1998). “Kim’s Principle of Explanatory Exclusion”. Australasian Journal of Philosophy, 76, p. 439-451.
  • Meixner, U. (2008). “New Perspectives for a Dualistic Conception of Mental Causation”. Journal of Consciousness Studies, 15, p. 17-38.
  • Melnyk, A. (1996). “Testament of a Recovering Eliminativist”. Philosophy of Science, 63, p. S185-S193.
  • Menzies, P. (2013). “Mental Causation in a Physical World”.  S. Gibb & R. Ingthorsson (eds.). Mental Causation and Ontology. Oxford: Oxford University Press.
  • Menzies, P. (2015). “The Causal Closure Argument is No Threat to Non-Reductive Physicalism”. Humana Mente, 29, p. 21-46.
  • Montero, B. (2003). “Varieties of Causal Closure”. Physicalism and Mental Causation. S. Walter and H. Hackmann (Eds.). Exeter: Imprint Academic, p. 173-187.
  • Moore, D. (2010). “The Generalization Problem and the Identity Solution”. Erkenntnis, 72 (1): 57-72.
  • Moore, D. & Campbell, N. (2015). “On the Metaphysics of Mental Causation”. Abstracta, 8, 2, p. 3-16.
  • Moore, D. (2017). “Mental Causation, Compatibilism, Counterfactuals”. Canadian Journal of Philosophy, 47, 1, p. 20-42.
  • Ney, A. (2007). “Can an Appeal to Constitution Solve the Exclusion Problem?” Pacific Philosophical Quarterly, 88, p. 486-506.
  • Noordhof, P. (1999). “Micro-Based Properties and the Supervenience Argument”. Proceedings of the Aristotelian Society, 99, p. 109-114.
  • O’Connor, T., and Wong, H. (2005). “The Metaphysics of Emergence”. Noûs, 39, p. 658-678.
  • Pauen, M. (2006). “Feeling Causes”. Journal of Consciousness Studies, 13, 1, p. 129-152.
  • Paul, L. & Hall, N. (2013). Causation: A User’s Guide. Oxford: Oxford University Press.
  • Papineau, D. (1993). Philosophical Naturalism. Oxford: Blackwell.
  • Papineau, D. (2001). “The Rise of Physicalism”. Physicalism and its Discontents. Gillett, C. and Loewer, B. (Eds.) Cambridge: Cambridge University Press, p. 3-36.
  • Papineau, D. (2002). Thinking About Consciousness. Oxford: Oxford University Press.
  • Pereboom, D. (2002). “Robust Nonreductive Materialism”. Journal of Philosophy, 99, p. 499-531.
  • Pineda, D. (2002). “The Causal Exclusion Puzzle”. European Journal of Philosophy 10: 26-42.
  • Pineda, S. & Vicente, A. (2017). “Shoemaker’s Analysis of Realization: A Review”. Philosophy and Phenomenological Research 94: 97-120.
  • Putnam, H. (1967). “Psychological Predicates”. Art, Mind and Religion, Capitan, W. and Merrill, D. (Eds.). Pittsburgh: University of Pittsburgh Press, p. 37-48.
  • Robinson, W. (2006). “Knowing Epiphenomena”. Journal of Consciousness Studies, 13, 1-2, p. 85-100.
  • Sider, T. (2003). “What’s So Bad About Overdetermination”. Philosophy and Phenomenological Research, 67, p. 719 – 726.
  • Silberstein, M. (2001). “Converging on emergence”. Journal of Consciousness Studies, 8, p. 61-98.
  • Silberstein M., and McGeever, J. (1999). “The Search for Ontological Emergence”. The Philosophical Quarterly, 49, p. 182-200.
  • Shoemaker, S. (2007). Physical Realization. Oxford: Oxford University Press.
  • Shoemaker, S. (2011). “Realization, Powers and Property Identity”. The Monist, 94, 1, p. 3-18.
  • Slors, M. and Walter, S. (2002). “Introduction”. Mental Causation, Multiple Realization, and Emergence, Slors, M. and Walter, S. (eds).  Rodopi.
  • Sosa, E. (1984). “Mind-Body Interaction and Supervenient Causation”. Midwest Studies in Philosophy, 9, p. 271-281.
  • Thomasson, A. (1998).  “A Nonreductivist Solution to Mental Causation”. Philosophical Studies, 89, p. 181-195.
  • Vicente, A. (2006). “On the Causal Completeness of Physics”. International Studies in the Philosophy of Science, 20, 2, p. 149-171.
  • Walter, S. (2008). “The Supervenience Argument, Overdetermination, and Causal Drainage”. Philosophical Psychology, 21: 673-696.
  • Wilson, J. (2011). “Non-Reductive Realization and the Powers-Based Subset Strategy”. The Monist, 94, 1, p. 121-154.
  • Woodward, J. (2003). Making Things Happen. Oxford: Oxford University Press.
  • Woodward, J. (2015). “Interventionism and Causal Exclusion”. Philosophy and Phenomenological Research, 91, 2, p. 303-347.
  • White, B. (2017). “Conservation Laws and Interactionist Dualism”. Philosophical Quarterly, 67, 267, p. 387-405.
  • Whittle, A. (2007). “The Co-Instantiation Thesis”. Australasian Journal of Philosophy, 85, p. 61-79.
  • Witmer, G. (2003). “Functionalism and Causal Exclusion”. Pacific Philosophical Quarterly, 84, p. 198-214.
  • Wyss, P. (2010).  “Identity With a Difference”. Emergence in Mind, MacDonald, C. and MacDonald, G. (eds.), Oxford: Oxford University Press, p. 169-179.
  • Zangwill, N. (1996). “Good Old Supervenience: Mental Causation on the Cheap”. Synthese, 106, 1, p. 67-101.
  • Zhong, L. (2014). “Sophisticated Exclusion and Sophisticated Causation”. Journal of Philosophy 111: 361–380.

 

Author Information

Dwayne Moore
Email: dwayne.moore@usask.ca
University of Saskatchewan
Canada

Two-Dimensional Semantics

Two-dimensional (2D) semantic theories distinguish between two different aspects, or ‘dimensions’, of the meaning of linguistic expressions. Many other theories identify the meaning of an expression with a dependency of its extension on the state of the world. (The extension of a sentence is its truth-value, and the extension of a sub-sentential expression is the object or objects it applies to.) Consider the following true sentence:

(1) Anand is a chess player.

If Anand had decided to spend his time very differently, sentence (1) would be false. Which extension this sentence has thus depends on whether a specific individual plays a particular game. One could hold, in line with a common view, that the meaning of (1) is captured by this dependency of its truth-value on Anand’s relation to chess. But notice that there is more than one way in which an expression’s extension depends on the state of the world. For example, in counterfactual circumstances in which the speakers in our linguistic community use the word ‘chess’ so that it exclusively applies to what we call ‘tennis’, (1) would be false. 2D semantic theories identify two kinds of dependencies of extension on the world, both of which are meant to represent important aspects of meaning.

2D semantics is a version of possible worlds semantics. Such theories standardly capture dependency of extension on the world by means of an intension, that is, a function from possible worlds to extensions. 2D semantic theories postulate at least two intensions that capture two kinds of dependencies of extension on the world. One intension, which is sometimes called the ‘2-intension’, corresponds to the first way of construing the dependency of the extension of (1) on the world just outlined. This intension returns ‘True’ for all and only those worlds in which Anand plays chess. In addition, all 2D semantic theories introduce another intension, sometimes called the ‘1-intension’.

Proponents of 2D semantic theories generally agree about how to construe 2-intensions. However, they differ in how they construe 1-intensions, and in what they take to be the theoretical purposes of 1-intensions. Concerning the latter issue, 1-intensions have generally been taken to capture either epistemic features associated with linguistic expressions—such as apriority or cognitive significance—or matters related to context-dependency. There is also disagreement about what kinds of items the theory should be applied to. It is widely accepted that a 2D semantics can be fruitfully applied to some kinds of expressions, such as indexicals. Accounts that apply a 2D semantics only to specific kinds of expressions are called ‘local accounts’ in what follows. Other researchers have argued, more controversially, that 2D semantics is a useful tool for characterizing the meanings of all kinds of expressions, or even the contents of mental states. Accounts that apply a 2D semantics to all kinds of expressions are called ‘global accounts’ in what follows.

The philosophical significance of 2D semantics extends far beyond the philosophy of language. Issues concerning 2D semantics and its interpretation have been at the heart of debates about the mind-body problem and philosophical methodology.

Table of Contents

  1. Introduction to 2D Semantics
    1. From Extensions to Intensions
    2. Rigid Designators and Externalism
    3. From 1D to 2D
  2. Local Accounts
    1. Content and Character (Kaplan)
    2. Superficial and Deep Modality (Evans, Davies & Humberstone)
  3. Global Accounts
    1. Metasemantic 2D Semantics (Stalnaker)
    2. Epistemic 2D Semantics (Chalmers, Jackson)
      1. Modal Rationalism
      2. The Scrutability of Truth
      3. Philosophical Methodology
  4. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Introduction to 2D Semantics

a. From Extensions to Intensions

Meaningful linguistic expressions have extensions. The extension of a sentence, such as (1), is its truth-value, in this case True. The extension of a general term, such as ‘chess player’, is the class of individuals to which the term applies, in this case the class of people who play chess. The extension of a singular term, such as ‘Anand’, is the individual denoted by the term, in this case Anand. Extension is not the same as meaning. ‘Anand’ is not synonymous with ‘the 15th world chess champion’, even though the expressions denote the same individual. Even more obviously, (1) is not synonymous with ‘Serena Williams is a tennis player’, even though the expressions have the same truth-value.

A popular idea is to characterize meanings as truth- or application-conditions, that is, the conditions under which a sentence is true or the conditions under which an expression is correctly applied. For instance, (1) is true if and only if Anand is a chess player. Given a particular state of the world, these truth-conditions then determine the truth-value of the sentence. Truth- and application-conditions capture the (or a) dependency of an expression’s extension on the state of the world. In possible world theories of meaning, they are modeled as intensions. An intension is a function that assigns extensions with respect to possible worlds. For example, the intension of (1) assigns True with respect to all and only those worlds in which Anand is a chess player; and the intension of ‘chess player’ assigns, with respect to a particular world, the class of all and only those individuals who play chess in that world. According to such possible worlds accounts, the meaning of an expression is thus its intension.

One way to motivate this idea is by arguing that the primary use of language is to exchange information, which suggests that the meaning of an expression (or at least a crucial aspect of its meaning) is its information content. Furthermore, information can be defined as the exclusion of possibilities, and possibilities are commonly characterized by means of possible worlds: Something is possible if and only if there is a possible world in which it is the case. Accordingly, the information conveyed by sentence (1) excludes all possible worlds in which it is not the case that Anand is a chess player. The remaining worlds are precisely those to which the intension of (1) assigns True. Hence, the intension of an expression is well suited for capturing its information content.

Another virtue of possible worlds theories of meaning is that they allow us to assign different meanings to expressions that share the same extension, thus respecting intuitive judgments about synonymy. For example, there is a possible world in which Anand never plays chess but in which Serena Williams is a tennis player. With respect to this world, (1) and ‘Serena Williams is a tennis player’ are assigned different truth-values, which implies that their intensions differ. Likewise, there is a possible world in which Gelfand wins the 2007 world chess championship tournament, thereby becoming the 15th world champion. With respect to this world, ‘Anand’ and ‘the 15th world chess champion’ have different extensions, which implies that their intensions differ as well.
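
To make the possible worlds picture concrete, the following Python sketch models intensions as functions from worlds to extensions. It is only a toy illustration: the two world labels, the selection of facts recorded for each world, and all function names are assumptions of the sketch, and the non-actual world simply combines the two counterfactual scenarios mentioned above.

```python
# Toy possible-worlds model. The world labels and the facts listed for the
# non-actual world are illustrative assumptions of this sketch.
WORLDS = {
    "actual": {
        "chess_players": {"Anand", "Gelfand"},
        "tennis_players": {"Williams"},
        "champion_15": "Anand",
    },
    # A world in which Anand never plays chess and Gelfand wins in 2007.
    "w_alt": {
        "chess_players": {"Gelfand"},
        "tennis_players": {"Williams"},
        "champion_15": "Gelfand",
    },
}


def sentence_1(world):
    """Intension of (1) 'Anand is a chess player': world -> truth-value."""
    return "Anand" in world["chess_players"]


def williams_sentence(world):
    """Intension of 'Serena Williams is a tennis player'."""
    return "Williams" in world["tennis_players"]


def anand(world):
    """Intension of the name 'Anand' (Anand in every world of this toy model)."""
    return "Anand"


def champion_15(world):
    """Intension of 'the 15th world chess champion'."""
    return world["champion_15"]


def same_intension(f, g):
    """Two expressions share an intension iff they agree at every world."""
    return all(f(w) == g(w) for w in WORLDS.values())


def information_content(sentence):
    """The worlds a sentence does not exclude: those where it is True."""
    return {name for name, w in WORLDS.items() if sentence(w)}


actual = WORLDS["actual"]
print(sentence_1(actual) == williams_sentence(actual))  # same extension
print(same_intension(sentence_1, williams_sentence))    # different intension
print(anand(actual) == champion_15(actual))             # same extension
print(same_intension(anand, champion_15))               # different intension
print(information_content(sentence_1))
```

In this model the two sentences agree in truth-value at the actual world but come apart at the alternative world, so their intensions differ; the same holds for ‘Anand’ and ‘the 15th world chess champion’.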

b. Rigid Designators and Externalism

Saul Kripke (1980) forcefully argued that some types of expressions, among them proper names and indexicals, are rigid designators, that is, they refer to the same individual with respect to every possible world. Suppose, for instance, that Gelfand claims ‘Anand could have become a professional tennis player’. It seems obvious that this statement is about the same individual that (1) is about, that is, Anand. The statement could not be made true by someone else doing something in some possible world. Hence, when one uses the name ‘Anand’ to talk about counterfactual circumstances, one still talks about the same person. Likewise, Anand can use the indexical ‘I’ to talk about what he himself would have done in counterfactual situations. This suggests that proper names such as ‘Anand’, and indexicals such as ‘I’, ‘here’, and ‘now’, are rigid designators.

Most philosophers have accepted Kripke’s claim that names and indexicals are rigid designators. In fact, the claim made above that (1) is true in all and only those worlds in which Anand is a chess player already presupposed that the name ‘Anand’ is a rigid designator. This illustrates that in a possible worlds account of meaning, the existence of rigid designators has immediate consequences for the meanings of expressions containing them.

A less obvious consequence that many have drawn from the existence of rigid designators, following Kripke (1980) and Hilary Putnam (1975), is that the meaning of linguistic expressions is not determined by a subject’s intrinsic properties, that is, that meaning externalism is true. Take the name ‘Anand’, as used by Gelfand. A person who is intrinsically identical to Gelfand might refer to a different person by saying ‘Anand’—this might be because his ‘Anand’-utterances are causally related to this other person. Hence, the utterances of Gelfand and his twin have different extensions. Assuming that names are rigid designators, they have different intensions as well—one utterance picks out Anand with respect to every possible world, the other picks out some other person with respect to every possible world. If intensions are meanings, the two subjects’ utterances also have different meanings.

Probably the most famous argument for externalism is provided by Putnam’s ‘Twin Earth’ thought experiment (1975, 139–144). Suppose that on Twin Earth, which is a planet in a remote part of our galaxy (or in another possible world), there is a substance that is called ‘water’ by the inhabitants of this planet and that shares all of its superficial properties with our water: It is colorless, odorless and drinkable, it falls out of grey clouds and is the dominant substance in rivers and lakes, and so forth. However, this substance has a different molecular structure than water, viz. XYZ. According to Putnam, XYZ is not water, and the term ‘water’ has a different meaning on Earth than it does on Twin Earth. One way to support these claims is by appealing to the fact that ‘water’ and other terms for natural kinds, such as ‘tiger’, ‘electron’, and ‘gold’, rigidly denote the kind they pick out in the actual world. Given this, ‘water’ as used by Oscar on Earth picks out H2O with respect to all possible worlds and thus has a different intension from ‘water’ as used by Twoscar, who lives on Twin Earth. At the same time, as Putnam notes, Oscar and Twoscar might be intrinsically identical. Hence, the intension of ‘water’ is not determined by the intrinsic properties of a speaker, and so, it seems that neither is the expression’s meaning.

All the major proponents of 2D semantics accept that proper names, indexicals, and natural kind terms are rigid designators. They also accept that at least one aspect of meaning is not determined by a speaker’s intrinsic properties.

c. From 1D to 2D

The possible worlds account just sketched in effect yields a one-dimensional (1D) semantic theory, in which the meaning of an expression is modeled by means of a single intension. Proponents of 2D semantics, by contrast, hold that at least for some kinds of expressions, a single intension is not enough to capture their meaning. A natural way to motivate this claim is by considering indexical expressions. Take the following sentence:

(2) I am a chess player.

Assuming that (2) is uttered by Anand, and given that ‘I’ is a rigid designator, the intension of (2) is true with respect to all and only those worlds in which Anand is a chess player. It is thus identical to the intension of (1). However, it seems clear that Anand’s utterance is not synonymous with an utterance of (1). Furthermore, assume that Serena Williams utters (2). The intension of this utterance is true in all and only those worlds in which Williams is a chess player. It thus differs from the intension of Anand’s utterance. So, the intension of (2) varies between different tokens of (2)’s type. In a way, this seems adequate: there is a sense in which Williams and Anand express something different by uttering (2). But it seems that there is also a sense in which what they express is the same. More generally, it is natural to think that indexicals such as ‘I’, ‘here’, and ‘now’, have a stable meaning, even if they are uttered by different people, at different places, and at different times. Accordingly, the sentence (2), that is, the sentence type, and other indexical expressions should also have stable meanings that do not vary with their producer.

The intension of an utterance involving an indexical and, consequently, its extension systematically depends on the circumstances in which the utterance is produced. For example, the intension of (2) is true with respect to a world w if and only if the individual who in fact (but not necessarily in w) utters this sentence is a chess player in w. According to proponents of 2D semantics, this kind of dependency must be captured when giving an account of the meaning of indexical expressions. Here is one way to systematize this dependency. The circumstances in which an expression is uttered are often called ‘context of use’. The worlds with respect to which (according to 1D accounts) an expression’s intension outputs an extension are often called ‘circumstances of evaluation’.

Now assume, first, that one wants to capture how the truth-value of indexical expressions varies depending on the expressions’ context of use. In 2D semantics, this is done by means of a 1-intension:

(1-Intension) A 1-intension is a function from contexts of use to extensions.

For example, relative to a context in which Serena Williams is not a chess player and utters (2), the 1-intension of her utterance is false. In 1-intensions, the context of use also serves as the circumstances of evaluation. In 1D accounts, in which expressions only have one intension, these intensions are construed differently. There, it is assumed that the context of use is fixed, that is, an expression is uttered by a specific individual, at a particular place, and at a particular time. Then, the expression is assigned an extension relative to different circumstances of evaluation. Within a 2D account, this kind of intension is a 2-intension:

(2-Intension) A 2-intension is a function from circumstances of evaluation to extensions.

For example, if Serena Williams utters (2), then her utterance is true with respect to counterfactual circumstances of evaluation in which Serena Williams is a chess player. The 2-intensions of indexical expression types vary between different contexts of use. This is precisely why one needs to appeal to 1-intensions to characterize the meanings of indexical expressions. However, every expression token, that is, every utterance, has a context of use and therefore also has a 2-intension. How 2-intensions of expressions vary between different contexts of use, that is, how they depend on contexts of use, can itself be captured by another, 2D-intension. It is a function from contexts of use to 2-intensions. Equivalently, it can be defined as a function that takes pairs of a context of use and circumstances of evaluation as input and delivers an extension as output:

(2D-Intension) A 2D-intension is a function from pairs of contexts of use and circumstances of evaluation to extensions.

A note concerning the construal of contexts of use and circumstances of evaluation: Circumstances of evaluation can simply be understood as possible worlds. It is natural to understand contexts of use as possible worlds as well. However, there is a catch. In many possible worlds, a great number of utterances are made. Consider, for instance, a world in which both Anand and Williams utter (2). If contexts of use are just possible worlds, then it is impossible to identify the utterance that is to be assigned an extension. A common way to solve this problem is to construe contexts of use as centered worlds (compare Lewis 1979). A centered world is a triple of a possible world, an individual, and a time. A centered world can thus serve to pick out the relevant utterance by specifying, or ‘marking’, the producer of the utterance and the time at which it is uttered.

All of the intensions just defined can be represented in a 2D matrix. Figure 1 below depicts a snippet of the 2D matrix of sentence (2). In each centered world, the individual at the center utters (2) at the marked time. The worlds involved have the following character—notice that centered worlds are flagged by a ‘*’, and that any wn* differs from any wn only in that the former involves a center.

w1* is centered on Anand. In w1, Anand and Gelfand are chess players, and Williams is not.

w2* is centered on Gelfand. In w2, Gelfand and Williams are chess players, and Anand is not.

w3* is centered on Williams. In w3, Anand is a chess player, and Gelfand and Williams are not.

Figure 1

The worlds on the left side, marked with an ‘*’, are contexts of use, understood as centered possible worlds. The worlds on the top are circumstances of evaluation, understood as possible worlds. Notice that the class of contexts of use and the class of circumstances of evaluation are identical, the only difference being the presence or absence of centering. For the purposes of illustration, here is how the second row of the matrix is evaluated. In this row, w2* is assumed to be the context of use, in which Gelfand utters (2). Now this utterance is evaluated with respect to different circumstances of evaluation. Since Gelfand is a chess player in w1 and in w2, but not in w3, the first two cells in this row get assigned a ‘T’ (for True), while the third one gets assigned an ‘F’ (for False). Now assume that w1* is the actual world. That is to say, the utterance of (2) that we consider is in fact produced by Anand, who is a chess player. Then the three kinds of intensions identified above are represented in the matrix as follows. The top row of the matrix, with respect to which the actual world, centered on Anand, is the context of use, represents the 2-intension of (2). The other rows represent 2-intensions that (2) could have had, if it had been uttered in different contexts. The 1-intension is represented by the diagonal that runs from the top left to the bottom right of the matrix. The matrix itself represents the utterance’s 2D-intension.
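
The evaluation procedure just described can be sketched in a few lines of Python. The sketch takes the facts about w1, w2, and w3 from the descriptions above; the function names are illustrative, and the time component of the centered worlds is omitted for brevity, which is a simplifying assumption.

```python
# Who plays chess in each world, as stipulated for w1, w2, and w3.
CHESS_PLAYERS = {
    "w1": {"Anand", "Gelfand"},
    "w2": {"Gelfand", "Williams"},
    "w3": {"Anand"},
}

# Each context of use wn* is the world wn plus a center (the speaker).
# The marked time is left out of this sketch.
CENTERS = {"w1": "Anand", "w2": "Gelfand", "w3": "Williams"}

WORLDS = ["w1", "w2", "w3"]


def two_d_intension(context, circumstance):
    """Extension of (2) 'I am a chess player' for a pair of a context of
    use and circumstances of evaluation."""
    speaker = CENTERS[context]  # 'I' picks out the center of the context
    return speaker in CHESS_PLAYERS[circumstance]


def two_intension(context):
    """A row of the matrix: fix the context, vary the circumstances."""
    return [two_d_intension(context, w) for w in WORLDS]


def one_intension():
    """The diagonal: each context also serves as the circumstances."""
    return [two_d_intension(w, w) for w in WORLDS]


def show(values):
    return " ".join("T" if v else "F" for v in values)


print("     w1 w2 w3")
for context in WORLDS:
    print(f"{context}*  " + "  ".join("T" if v else "F" for v in two_intension(context)))
print("1-intension (diagonal):", show(one_intension()))
```

The row for w2* comes out as T, T, F, matching the evaluation of the second row given above. The row for w1* (T, F, T) is the 2-intension of Anand’s utterance, and the diagonal (T, T, F) is the 1-intension.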

According to the 2D account just sketched, indexical expressions are associated with three intensions. This raises the question: Which of these intensions represents the meaning of these expressions? The 1-intensions and 2D-intensions of indexical expressions do not vary with their context of use. This distinguishes them from 2-intensions and makes them more suitable for representing the meaning of such expressions. Relatedly, it is plausible that subjects who know the meaning of an indexical expression are able to evaluate 1-intensions and 2D-intensions. For example, a speaker is able to say that if, in the context of use, a is the speaker, and if, in the circumstances of evaluation, a is a chess player, (2) is true. An expression’s 2-intension, however, cannot be evaluated on the basis of mere semantic competence, because semantic competence does not provide knowledge of the context of an utterance. These considerations suggest that both the 1-intension and the 2D-intension of an indexical expression are better candidates for representing its meaning. But 2D accounts are not committed to the claim that one of these intensions represents the meaning of such an expression. Rather, proponents of 2D semantics could say that all three intensions represent important aspects of meaning. For instance, as was mentioned above, when Williams and Anand utter (2), one would like to say that in one sense, they expressed the same thing, and in another sense, they expressed something different. The sense in which what is expressed differs is reflected in the differences between the 2-intensions of the respective utterances. 2D semantics thus provides the resources to capture all these aspects of meaning.

All 2D semantic theories share the basic structure represented by a 2D matrix. This structure is 2D because the worlds involved play two different roles. Above, these roles were introduced as the contexts of use on the one hand, and the circumstances of evaluation on the other. However, notice that, while this is a natural and popular understanding of the two roles of possible worlds in 2D semantics, some 2D accounts understand these two roles in slightly different ways. In particular, the construal of the worlds involved in the ‘first dimension’, that is, the centered worlds listed on the left of each row in the 2D matrix, is contested. As a consequence, the construal of 1-intensions is contested as well. 2D accounts also differ in other respects. There is widespread agreement that a 2D account is well suited to describe the meanings of indexicals, and of indexical expressions. However, whether or not a 2D account should also be applied to other kinds of expressions, and if so, to what kinds of expressions, is controversial.

2. Local Accounts

a. Content and Character (Kaplan)

We just saw that indexicals provide a natural motivation for adopting a 2D theory. It is therefore unsurprising that the first 2D theories were introduced as accounts of indexical expressions (compare, for example, Kamp 1971). One such account that has been particularly influential is David Kaplan’s general semantic theory of indexicals (compare Kaplan 1989). According to Kaplan, expressions have contents. These contents are supposed to correspond to ‘what is said’ by the relevant expression. Furthermore, the content of a sentence token is a proposition. Kaplan’s way of construing propositions, and contents in general, requires some elaboration. On his view, propositions are structured entities. The content of Anand’s utterance of (2), for instance, is a singular proposition that consists of Anand himself and the property of being a chess player. (A proposition is singular if it has an individual as its constituent.) According to Kaplan, the content of a singular expression, such as ‘Anand’, is an individual, in this case Anand. The content of a general expression, such as ‘chess player’, is a property, in this case the property of being a chess player. The contents of composite expressions then systematically depend on the contents of their parts.

This account of content seems very different from a possible worlds account. However, the contents postulated by Kaplan can be taken to determine intensions. And in fact, Kaplan often appeals to intensions to characterize contents. For Kaplan, an intension is a function from circumstances of evaluation to extensions. Kaplan’s intensions are basically the 2-intensions introduced above, except that Kaplan favors a different characterization of circumstances of evaluation. For Kaplan, circumstances of evaluation are not just possible worlds. They also include a designated time and potentially other features. Now consider again Anand’s utterance of (2), which expresses a singular proposition containing Anand and the property of being a chess player. This proposition determines an intension that is true with respect to all circumstances of evaluation in which Anand is a chess player.

According to Kaplan, indexicals are directly referential, which is to say that the only contribution they make to the contents of the expressions they figure in is their referent. He takes this to imply that indexicals pick out the same individual with respect to all circumstances of evaluation. Kaplan thus seconds Kripke’s claim that indexicals are rigid designators.

Up to this point, the account described is just a standard 1D account. However, Kaplan argues that content is not all there is to meaning. He therefore introduces another aspect of meaning, character. The character of an expression can be understood as a rule that specifies how the content of the expression depends on the context. For example, for the indexical ‘I’, the rule would be something like this: ‘If x is the producer of the utterance in the relevant context, then x is the content of ‘I’’. Similar rules apply to other indexicals. More formally, characters can be defined as functions from contexts to contents. Hence, on the possible worlds understanding of content, characters are 2D-intensions. The inclusion of characters thus makes Kaplan’s account a type of 2D semantics. Now consider three contexts, in all of which someone utters ‘I’. In w1*, it is Anand; in w2*, it is Gelfand; in w3*, Williams. Given these contexts, Kaplan’s account entails the following snippet of the matrix for the indexical ‘I’:

Figure 2

This matrix illustrates Kaplan’s claim that ‘I’ is a rigid designator: First, the reference of ‘I’ is determined by the context, and then the expression picks out the same individual with respect to all circumstances of evaluation.
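
The two-step picture (context fixes content, content fixes extension at every circumstance) can be illustrated with a short Python sketch. The class and function names are hypothetical, the time and location fields are left as placeholders, and circumstances of evaluation are simplified to bare world labels; none of this is Kaplan’s own formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """A context with an agent, a time, and a location within a world.
    Time and location are placeholders in this sketch."""
    agent: str
    world: str
    time: str = "unspecified"
    location: str = "unspecified"


def character_of_I(context):
    """Character of 'I': a function from contexts to contents.
    The content is simply the agent of the context."""
    return context.agent


def intension_of(content):
    """A directly referential content determines a constant intension:
    the same individual at every circumstance of evaluation."""
    return lambda circumstance: content


CONTEXTS = {
    "w1*": Context(agent="Anand", world="w1"),
    "w2*": Context(agent="Gelfand", world="w2"),
    "w3*": Context(agent="Williams", world="w3"),
}
CIRCUMSTANCES = ["w1", "w2", "w3"]

# One row per context: the referent of 'I' at each circumstance.
matrix = {
    label: [intension_of(character_of_I(ctx))(w) for w in CIRCUMSTANCES]
    for label, ctx in CONTEXTS.items()
}

for label, row in matrix.items():
    print(label, row)
```

Each row is constant (for example, the row for w2* contains Gelfand three times), which is how rigidity shows up in the matrix, while the rows differ from one another because the character assigns different contents to different contexts.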

For ease of exposition, it has so far been assumed that contents are assigned to utterances in Kaplan’s account, and that Kaplan’s contexts are just the contexts of use introduced in § 1. However, neither assumption is entirely correct. According to Kaplan, contents are assigned to expressions with respect to contexts. (Characters, on the other hand, are assigned to expressions without relativization to anything.) The subject that is the content of, for instance, ‘I’, need not in fact have produced an utterance in a Kaplanian context. On his account, the context does not have to involve an utterance at all. Kaplan states that every context has an agent, a time, and a location within a possible world. (Contexts can thus be understood as centered worlds.) The content of the expression ‘I’ with respect to a context is the agent of the context, where this agent may or may not produce an utterance in the relevant context. This feature of Kaplan’s account has implications for the evaluation of some expressions. For instance, the sentence ‘I utter nothing’ is true with respect to some Kaplanian contexts, while it comes out as false with respect to all contexts of use as they were construed in § 1.

The character of an indexical expression corresponds to what a competent speaker can know in virtue of understanding the expression. The same does not hold for contents, since they vary between contexts. Should one therefore say that the character of an indexical is its meaning? While Kaplan affirms this in several places, he stresses elsewhere that content is also an important aspect of meaning. Again, it does not seem too important to settle the question of what the meaning of indexicals is. What is clear is that both characters and contents play crucial roles in Kaplan’s account of the semantics of indexicals.

On Kaplan’s account, all meaningful expressions can be assigned a character and a content. However, he believes that the characters of many expressions are not very interesting, since they assign the same content with respect to every context. Expressions of this type thus have a constant content. According to Kaplan, proper names fall into this category. In Kaplan’s view, we can say with respect to such expressions that their meaning is just their content. In any case, it would not be theoretically very fruitful to apply Kaplan’s 2D account to expressions with a constant content.

b. Superficial and Deep Modality (Evans, Davies & Humberstone)

Gareth Evans (1979) and Martin Davies & Lloyd Humberstone (1980) applied ideas from 2D semantics to give accounts of both contingent truths that can be known a priori, and of necessary truths that can only be known a posteriori. The existence of both kinds of truths seems to follow straightforwardly from the fact that some expressions are rigid designators. For instance, take the names ‘Hesperus’ and ‘Phosphorus’, both of which refer to the same object, the planet Venus. Since both names are rigid designators, they refer to Venus with respect to every possible world. And this implies that ‘Hesperus = Phosphorus’ is necessarily true. At the same time, it took substantial astronomical research to establish that Hesperus = Phosphorus, and it seems clear that no amount of a priori reasoning could have sufficed to come to know it. Hence, ‘Hesperus = Phosphorus’ is an example of a necessary a posteriori truth (or at least it is a necessary truth that if Hesperus exists, then Hesperus = Phosphorus).

A simple way of formulating contingent a priori truths is by drawing on sentences that contain the expression ‘actual’. This expression is standardly taken to be a device that turns non-rigid expressions into rigid designators. For instance, take the definite description ‘the 15th world chess champion’. This expression picks out Anand; but with respect to a world in which Gelfand wins the 2007 world championship tournament, it picks out Gelfand. The description at issue is therefore not rigid. However, if one adds the word ‘actual’ to it, changing the description to ‘the actual 15th world chess champion’, the new description will pick out the person who in our world, the actual world, is the 15th world chess champion (that is, Anand) with respect to every possible world. With this in mind, consider the sentence ‘The actual 15th world chess champion is the 15th world chess champion’. This sentence is contingent: for instance, it is false with respect to the world just mentioned, in which Gelfand, and thus someone other than the actual 15th world chess champion, is the 15th world chess champion. At the same time, the sentence can be known a priori. Hence, ‘The actual 15th world chess champion is the 15th world chess champion’ is a contingent a priori truth.

Many people have found it puzzling that there could be contingent a priori truths and necessary a posteriori truths. If a sentence is contingent, then it seems that its truth depends on features that are not shared by all worlds. It is thus natural to think that to find out whether our world has these features, one needs to do empirical research. On the other hand, if a sentence is necessary, then it seems that its truth does not depend on specific features of our world. It is thus natural to think that to find out whether such a sentence is true, purely a priori reasoning is sufficient.

Evans tries to explain how there can be contingent a priori truths, focusing on examples that arise from what he calls “descriptive names”. To introduce such a name, he stipulates that the name ‘Julius’ is to refer to whoever invented the zipper (Evans 1979, 163). A descriptive name is thus a name whose reference is fixed by a description—in this case, the description ‘the inventor of the zipper’. Evans argues that since descriptive names are names, they are rigid designators. The descriptive name ‘Julius’ thus refers to the same person with respect to every possible world, unlike the definite description ‘the inventor of the zipper’. With this in mind, consider the following sentence:

(3) Julius invented the zipper.

With respect to a possible world in which someone other than the actual inventor of the zipper (Whitcomb Judson) invented the zipper, (3) is false. Hence, (3) is contingent. At the same time, according to Evans, someone who understands the expression ‘Julius’ knows its associated description and is thus in a position to know a priori that (3) is true. Therefore, (3) is a contingent a priori truth. To account for such sentences, Evans introduces a distinction between superficial and deep contingency. Superficial contingency corresponds to the ordinary understanding of contingency—as Evans puts it, whether a sentence is superficially contingent depends on how it “embeds in the scope of modal operators” (1979, 161). Deep contingency, on the other hand, depends on what makes a sentence true: A sentence is deeply contingent only if the world needs to satisfy some condition for this sentence to be true, that is, only if there is some feature that the world needs to have to make it true. What makes a sentence true, according to Evans, is in turn related to the sentence’s content. Accordingly, a deeply necessary sentence is one whose content guarantees its truth. Superficial and deep contingency can come apart because the notion of content is not tied to metaphysical modality. Evans’s notion of content thus differs from the one invoked, for instance, by Kaplan. Following Gottlob Frege (1892/1952), Evans holds that there are epistemic constraints on content: If two sentences have the same content, then a subject who understands both of them cannot believe what one of the sentences says without also believing what the other one says. Evans calls such sentences “epistemically equivalent”.

According to Evans’s distinction, (3) is superficially contingent. Evans also holds that ‘Julius’ and ‘the inventor of the zipper’ have the same content; therefore, (3) is deeply necessary. In his view, there can be no a priori sentences that are deeply contingent. Accordingly, contingent a priori truths are those sentences that are superficially contingent but deeply necessary, that is, those whose truth is guaranteed by their content, even though they are not true in all possible worlds. One might have doubts that such a separation of content and modality is sensible. Take the two sentences ‘Julius is male’ and ‘The inventor of the zipper is male’. These sentences place different demands on a possible world with respect to which they are to be true: ‘Julius is male’ is true with respect to some world if and only if the individual who invented the zipper in the actual world is male, while ‘The inventor of the zipper is male’ is true with respect to some world if and only if the individual who invented the zipper in that world is male. So how could these sentences have the same content, as Evans’s account has it? In response to this kind of worry, Evans points out that the sentences nevertheless place the exact same demands on the actual world: They are both true with respect to the actual world if and only if the individual who invented the zipper in that world is male. Since believing something means believing that it is actually the case, the two sentences are epistemically equivalent. And this, in turn, implies that they have the same content.
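
Evans’s reply can be illustrated with a small Python sketch. It rests on several assumptions that are not part of Evans’s text: the alternative world and its inventor ‘Smith’ are hypothetical, and ‘the actual world’ is modeled as an explicit parameter so that one can check what each sentence demands of a world when that very world is taken to be actual.

```python
# Toy worlds. 'Smith' and the world 'w_alt' are hypothetical; only the
# actual inventor (Whitcomb Judson) comes from the example above.
WORLDS = {
    "actual": {"inventor": "Judson", "male": {"Judson"}},
    "w_alt": {"inventor": "Smith", "male": {"Judson"}},
}


def julius_is_male(actual, w):
    """'Julius is male': 'Julius' rigidly refers to whoever invented the
    zipper in the world taken as actual; maleness is checked in w."""
    julius = WORLDS[actual]["inventor"]
    return julius in WORLDS[w]["male"]


def inventor_is_male(actual, w):
    """'The inventor of the zipper is male': the description picks out
    whoever invented the zipper in w itself."""
    return WORLDS[w]["inventor"] in WORLDS[w]["male"]


# Different demands on a counterfactual world:
print(julius_is_male("actual", "w_alt"))    # Judson is male in w_alt
print(inventor_is_male("actual", "w_alt"))  # Smith is not male in w_alt

# Same demands on whichever world is taken to be actual:
same_on_actual = all(
    julius_is_male(w, w) == inventor_is_male(w, w) for w in WORLDS
)
print(same_on_actual)
```

The two sentences receive different truth-values at the counterfactual world, yet they agree whenever the world of evaluation is the one taken as actual, which is the sense in which they place the same demands on the actual world.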

By distinguishing between two kinds of modality, Evans draws on a central idea of 2D semantics. Accordingly, deep and superficial modality could, in principle, be used to define 1- and 2-intensions, respectively. The connection to 2D semantics becomes even clearer once one considers the account of Davies & Humberstone (1980), who try to characterize Evans’s distinction between superficial and deep necessity in formal terms. They start from a standard modal logic, with ‘□’ as the sentential operator expressing necessity. Then they introduce the sentential operator ‘A’, which stands for ‘it is actually the case that’. In line with what was said above about ‘actually’-involving expressions, a sentence AS is true with respect to a world if and only if S is true with respect to the actual world. Accordingly, if S is true, then AS is necessarily true, that is, true with respect to every world. But as Davies & Humberstone note, there is an intuitive sense in which some other world might have been actual, and thus, we can consider different worlds as actual. Based on this idea, they introduce another sentential operator, F (for ‘fixedly’), such that FS is true with respect to a world w if and only if S is true with respect to w irrespective of which world is considered as actual. Combining these two operators, one can derive another operator, FA, such that FAS is true if and only if S is true with respect to any world that is considered as actual. As Davies & Humberstone point out, the resulting logic can also be characterized in 2D terms. Accordingly, one can evaluate a sentence S with respect to pairs of a world considered as actual and a possible world; this way of evaluating expressions thus yields a kind of 2D-intension.

Davies & Humberstone argue that the distinction between □-truth and FA-truth captures Evans’s distinction between superficial and deep necessity. Accordingly, FAS is true if and only if S is deeply necessary. On Evans’s account, this implies that if S is a priori, then FAS is true. Davies & Humberstone hypothesize that all contingent a priori truths are A-involving. For instance, assume that it is part of the meaning of Evans’s descriptive name ‘Julius’ that it rigidly refers to the inventor of the zipper. Then we can take ‘Julius’ to abbreviate ‘the actual inventor of the zipper’. Given this, (3) is clearly FA-true: No matter which world w is considered as actual, the person who invented the zipper in w actually invented the zipper in w. Other examples of sentences that are FA-true and contingent a priori are easy to come by. These include all sentences of the form S ↔ AS, such as ‘Grass is green if and only if grass is actually green’. Davies & Humberstone also hold that there are many A-involving necessary a posteriori truths. For instance, if S is an ordinary (superficially and deeply) contingent truth, such as ‘Grass is green’, then AS is necessary and a posteriori.
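The interplay of ‘□’, ‘A’, and ‘F’ can be illustrated with a small model. The following is only a sketch under simplifying assumptions: the two worlds, the helper names (atom, Box, holds, and so on), and the stipulation that grass is green in w1 but not in w2 are all hypothetical, and a sentence is modelled simply as a function from pairs (world considered as actual, world of evaluation) to truth-values.

```python
from typing import Callable

# Hypothetical two-world model: grass is green in w1 but not in w2.
WORLDS = ["w1", "w2"]

# A sentence is evaluated at a pair: (world considered as actual, world of evaluation).
Sentence = Callable[[str, str], bool]

def atom(true_in):
    """Ordinary sentence whose truth depends only on the world of evaluation."""
    return lambda actual, w: w in true_in

def A(s):
    """'Actually S': evaluate S at the world considered as actual."""
    return lambda actual, w: s(actual, actual)

def Box(s):
    """'Necessarily S': S holds at every world of evaluation, actual world held fixed."""
    return lambda actual, w: all(s(actual, v) for v in WORLDS)

def F(s):
    """'Fixedly S': S holds at w whichever world is considered as actual."""
    return lambda actual, w: all(s(b, w) for b in WORLDS)

def FA(s):
    """'Fixedly actually S': S holds at every world when that world is considered as actual."""
    return F(A(s))

def iff(s, t):
    return lambda actual, w: s(actual, w) == t(actual, w)

def holds(s, actual="w1"):
    """Truth simpliciter: evaluate at the actual world, considered as actual."""
    return s(actual, actual)

grass = atom({"w1"})
biconditional = iff(grass, A(grass))  # 'Grass is green iff grass is actually green'

print(holds(Box(biconditional)), holds(FA(biconditional)))  # False True
print(holds(Box(A(grass))), holds(FA(A(grass))))            # True False
```

In this toy model, S ↔ AS comes out FA-true but not □-true, which matches the profile of a contingent a priori truth, while AS comes out □-true but not FA-true, which matches the profile of a necessary a posteriori truth.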

As was noted above, Davies & Humberstone follow Evans in holding that all a priori truths are deeply necessary, which in their framework means that they are FA-true. According to Davies & Humberstone, contingent a priori truths involve a divergence between □-truth and FA-truth—such sentences are FA-true but not □-true—that is due to the involvement of an (implicit) A-operator. If this indeed applies to all contingent a priori truths, then these can be given a unified explanation in their framework. But it is not obvious that all contingent a priori truths are A-involving. Take, for instance, ‘The local theater is a theater’. Given that the expression ‘local’, like other indexicals, is a rigid designator, this sentence is contingent. It is also clearly a priori. But it is less clear that ‘local’, or any other expression in the sentence at hand, is even implicitly A-involving. It is therefore disputable both that Davies & Humberstone can explain all contingent a priori truths and that they can preserve Evans’s claim that all a priori truths are deeply necessary. Nevertheless, there is some plausibility to the claim that ‘The local theater is a theater’, and indeed all contingent a priori truths, involve some kind of implicit or explicit reference to actuality.

Davies & Humberstone tentatively suggest that many other expressions are also A-involving, among them natural kind terms, such as ‘water’. Recall that, since water is composed of H2O molecules, ‘water’ rigidly refers to H2O. This implies that ‘Water = H2O’ is necessarily true. Since this sentence cannot be known a priori, it represents another example of a necessary a posteriori truth. Davies & Humberstone’s suggestion is that ‘water’ and other natural kind terms can be understood analogously to descriptive names. For instance, the description associated with ‘water’ could be something like ‘the actual chemical kind exemplified by the liquid that falls from clouds, flows in rivers, is colorless and odorless, …’ (compare Davies & Humberstone 1980, 18), which rigidly refers to H2O. If this is correct, then sentences containing the term ‘water’ are A-involving, and the fact that ‘Water = H2O’ is a necessary a posteriori truth can be explained by Davies & Humberstone’s account. However, Davies & Humberstone believe that ordinary proper names are not even implicitly A-involving, and thus that true identity statements involving names, such as ‘Hesperus = Phosphorus’, are both □-true and FA-true. This is in line with Evans’s view, according to which such sentences are both superficially and deeply necessary. Hence, not all necessary a posteriori sentences are given a unified treatment in the account of Evans and Davies & Humberstone.

3. Global Accounts

a. Metasemantic 2D Semantics (Stalnaker)

Robert Stalnaker (1978) introduces his 2D account as a part of a theory of assertions and their role in communication. According to Stalnaker, the contents of assertions are propositions, which he construes as intensions, that is, functions from possible worlds to extensions. Every proposition thus corresponds to a set of possible worlds, viz. those worlds with respect to which the extension is True. In a conversation, each of the participants makes certain assumptions. These speaker presuppositions are those propositions that the participants in the conversation believe to be true, or at least accept for the purposes of the conversation, and that they believe to be accepted by all the other participants in the conversation. Those speaker presuppositions that are indeed shared, and known to be shared, by all participants in a conversation constitute their common knowledge. This common knowledge is characterized by the context set—the set of those possible worlds that are not ruled out by the common knowledge of the participants in the conversation. Now if a speaker in a conversation asserts a proposition that is accepted by the hearers, then this proposition is added to their common knowledge, which means that those possible worlds not compatible with it are eliminated from the context set. On this account, the goal of communication is to reduce the context set by means of making assertions.
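The update mechanism just described amounts to set intersection. The following sketch assumes that a proposition is modelled as the set of worlds in which it is true; the four worlds and the two example propositions are hypothetical and serve only as illustration.

```python
# Hypothetical example worlds and propositions; a proposition is modelled as
# the set of worlds in which it is true.
ALL_WORLDS = frozenset({"w1", "w2", "w3", "w4"})

def update(context_set, proposition):
    """Accepting an assertion removes the worlds incompatible with it."""
    return context_set & proposition

context_set = ALL_WORLDS
it_is_raining = frozenset({"w1", "w2"})
it_is_cold = frozenset({"w2", "w3"})

context_set = update(context_set, it_is_raining)
context_set = update(context_set, it_is_cold)
print(sorted(context_set))  # ['w2']

# A proposition that is true in every world eliminates nothing.
print(update(ALL_WORLDS, ALL_WORLDS) == ALL_WORLDS)  # True
```

Each accepted assertion can only shrink the context set or leave it unchanged, and an assertion whose content is true in every world leaves it unchanged.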

One problem about this very natural account of assertion and communication is that it seems unable to explain the use of certain perfectly sensible assertions. For example, there are many conceivable circumstances in which a speaker successfully communicates something by asserting ‘Hesperus = Phosphorus’. However, the proposition expressed by this utterance has a necessary intension (an intension that is constantly True in all possible worlds). Therefore, no matter what the common knowledge of the participants in such a conversation consists in, the utterance cannot eliminate any possibilities from the context set. Stalnaker thus needs to explain how an utterance of ‘Hesperus = Phosphorus’ and other utterances of this type can be informative. His explanation relies on the fact that which proposition a specific sentence expresses depends on features of the world. Suppose, for instance, as seems plausible, that if some celestial body other than Venus had been the brightest object in the evening sky (BOE), then that object would have been called ‘Hesperus’, and likewise that if some celestial body other than Venus had been the brightest object in the morning sky (BOM), then that object would have been called ‘Phosphorus’. Given this, if Mars had been the BOE and Venus the BOM, ‘Hesperus = Phosphorus’ would have expressed a different proposition that is necessarily false. In Stalnaker’s account, this dependency of the proposition expressed by an utterance on the state of the world is captured by a propositional concept. A propositional concept is a function from possible worlds to propositions or, equivalently, from pairs of possible worlds to truth-values. A propositional concept is thus a 2D-intension; it corresponds to a 2D matrix. Below is a snippet of the 2D matrix of ‘Hesperus = Phosphorus’, involving the following worlds:

w1: BOE = Venus; BOM = Venus

w2: BOE = Mars; BOM = Venus

w3: BOE = Mars; BOM = Mars


Figure 3

As was just noted, the whole matrix represents a propositional concept and thus a 2D-intension. Given that w1 is the actual world, the upper row of the matrix represents the intension actually expressed by ‘Hesperus = Phosphorus’. In 2D terminology, this horizontal intension is a 2-intension. The diagonal of the matrix running from the upper left to the bottom right is what Stalnaker calls a ‘diagonal proposition’. The diagonal proposition of ‘Hesperus = Phosphorus’, which in 2D terms is its 1-intension, is true with respect to a world if and only if the sentence expresses a true proposition in this world.

While the 2-intension, or horizontal proposition, of ‘Hesperus = Phosphorus’ is necessary, its 1-intension, or diagonal proposition, is contingent, which reflects the fact that (for all that is presupposed in a certain context) the sentence could have expressed a different, false proposition. This is crucial for Stalnaker’s explanation of the informativeness of assertions such as ‘Hesperus = Phosphorus’, because he argues that in uttering one of them, a speaker communicates the expression’s diagonal proposition. Assume, for instance, that w1, w2, and w3 are in the context set in a conversation, when the speaker utters ‘Hesperus = Phosphorus’. Interpreted according to its 2-intension—which for Stalnaker corresponds to literal interpretation—this utterance is uninformative and thus violates an important conversational rule. Moreover, a hearer who trusts this utterance knows that it is uninformative. According to Stalnaker, the utterance should thus be reinterpreted. What it really communicates is that the sentence ‘Hesperus = Phosphorus’ expresses something true. Assuming that it is common knowledge in the conversation that Hesperus is the BOE and Phosphorus the BOM, the utterance also conveys that the BOE is identical to the BOM. This content is captured by the utterance’s diagonal proposition. Hence, if the hearer accepts the speaker’s utterance, then w2, with respect to which the diagonal proposition is false, is eliminated from the context set.
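The values of the matrix for ‘Hesperus = Phosphorus’ can be recomputed from the descriptions of w1, w2, and w3. The sketch below assumes, in line with the text, that ‘Hesperus’ and ‘Phosphorus’ rigidly designate whatever is the BOE and the BOM in the world that fixes what the sentence expresses; the function and variable names are hypothetical.

```python
# World descriptions as given in the text.
WORLDS = {
    "w1": {"BOE": "Venus", "BOM": "Venus"},
    "w2": {"BOE": "Mars", "BOM": "Venus"},
    "w3": {"BOE": "Mars", "BOM": "Mars"},
}

def expresses_truth(context, evaluation):
    """Truth-value, at `evaluation`, of what 'Hesperus = Phosphorus' expresses in `context`."""
    # Assumption: each name rigidly designates the body it picks out in the
    # context world, so the evaluation world does not affect the result.
    hesperus = WORLDS[context]["BOE"]
    phosphorus = WORLDS[context]["BOM"]
    return hesperus == phosphorus

# Rows: world in which the sentence is uttered; columns: world of evaluation.
matrix = {c: {e: expresses_truth(c, e) for e in WORLDS} for c in WORLDS}
horizontal = matrix["w1"]                     # 2-intension, given that w1 is actual
diagonal = {w: matrix[w][w] for w in WORLDS}  # 1-intension (diagonal proposition)

for c, row in matrix.items():
    print(c, ["T" if v else "F" for v in row.values()])
# w1 ['T', 'T', 'T']
# w2 ['F', 'F', 'F']
# w3 ['T', 'T', 'T']

# Accepting the diagonal proposition eliminates w2 from the context set.
context_set = {w for w in WORLDS if diagonal[w]}
print(sorted(context_set))  # ['w1', 'w3']
```

The horizontal proposition in the w1 row is true at every world and so cannot shrink the context set, whereas the diagonal proposition is false at w2 and therefore can.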

There are some important differences between Stalnaker’s 2D account and the accounts considered in § 2. For a start, the accounts discussed previously were introduced to explain the behavior of specific kinds of expressions, such as indexicals (Kaplan), descriptive names (Evans; Davies & Humberstone), ‘actually’ (Evans; Davies & Humberstone), and natural kind terms (Davies & Humberstone). But since the proposition expressed by any sentence depends on the state of the world, Stalnaker’s 2D account can be sensibly applied to all kinds of sentences. Furthermore, Stalnaker (1987, 182f) stresses that his account concerns expression tokens, not types. The reason for this is that Stalnaker’s 2D account is not semantic, but metasemantic: Its 1-intension and its 2D-intension are not aspects of the meaning of expressions, but capture how their meanings depend on features of the world. And as Stalnaker notes, the latter dependency can vary between tokens of an expression type.

Stalnaker’s diagonal propositions have several further uses, for example in capturing the contents of mental states. Stalnaker (1981) argues that diagonal propositions can serve to resolve puzzles raised by so-called ‘indexical’ or ‘egocentric’ beliefs, such as ‘I am sleepy’ or ‘It is dark here’, that are essentially about the believer and her relation to the world. The following story, loosely based on a case devised by John Perry (1977, 492), illustrates one such puzzle. Suppose that Anand has lost his memory and does not remember who he is. From a book about the history of chess, he learns that Anand is the 15th world chess champion. But Anand is quite sure that he himself never even played in a world championship. Hence, Anand believes both ‘I am not the 15th world chess champion’ and ‘Anand is the 15th world chess champion’. On the possible worlds account of content endorsed by Stalnaker, the former of these beliefs is true in all and only those worlds in which Anand is not the 15th world chess champion. Accordingly, the two beliefs are contradictory. But this does not seem right since, from Anand’s perspective, there is a clear sense in which his beliefs could both be true. Stalnaker’s solution is to ascribe to the subject the diagonal proposition of one of the beliefs in such cases. In the case at hand, one option is to reinterpret the belief of Anand’s that he would express by saying ‘I am not the 15th world chess champion’, such that he in fact believes not the horizontal proposition, that is, the 2-intension associated with this utterance, but rather the 1-intension associated with it, that is, the diagonal proposition. This belief is compatible with Anand being the 15th world chess champion because there are, for instance, worlds in which the amnesiac reading a book about chess history is not Anand, or has simply never played a match for the world championship. By ascribing the diagonal proposition to Anand, one can thus escape the undesirable conclusion that his belief state is inconsistent.

We saw above that Stalnaker’s 2D account can be applied to a posteriori necessities, such as ‘Hesperus = Phosphorus’. Stalnaker (2001, 155) suggests that his account can provide a general explanation for this phenomenon. This is a surprising claim, since the explanation he offers for the informativeness of ‘Hesperus = Phosphorus’ and other necessary a posteriori truths can be applied just as well to necessary a priori sentences, such as the following sentence that states Fermat’s last theorem: ‘No three positive integers a, b, and c satisfy a^n + b^n = c^n for any n greater than 2’. It is very plausible that mathematical truths, such as Fermat’s last theorem, are necessary. At the same time, it is intuitively obvious that the above sentence can be informative for a subject. On Stalnaker’s account, this is explained in the same way that the informativeness of necessary a posteriori truths is explained, by the fact that the diagonal proposition expressed by the above sentence is contingent. Intuitively, one might think that the informativeness of necessary a priori truths and that of necessary a posteriori truths are different kinds of phenomena that demand different explanations. However, whether one should consider this as a problem for Stalnaker’s account depends on one’s theoretical commitments. Stalnaker himself is skeptical about the existence of a priori truths. From his perspective, there is thus no deeper theoretical reason to provide structurally different explanations for the informativeness of, say, a statement of Fermat’s last theorem on the one hand and of ‘Hesperus = Phosphorus’ on the other.

One may have doubts that it is always adequate to ascribe diagonal propositions in cases that concern seemingly informative necessary (or necessarily false) statements or contents. For instance, is Anand’s belief, expressed by ‘I am not the 15th world chess champion’, really about the truth-value of this particular expression? Similarly, is the information a subject acquires upon hearing an utterance of ‘Hesperus = Phosphorus’ really metalinguistic? Notice, however, that the information conveyed in cases that involve the ascription of diagonal propositions need not be, at least not purely, metalinguistic. For instance, in the case discussed above, the speaker managed to convey to the hearer that the BOE is identical to the BOM by uttering ‘Hesperus = Phosphorus’. This is enabled by their common knowledge, in particular by the fact that in all the worlds in the context set, the BOE and the BOM are called ‘Hesperus’ and ‘Phosphorus’, respectively. This illustrates how diagonal contents can capture ordinary object-level information. Nevertheless, diagonal propositions do involve metalinguistic information, and their transmission in communication reflects a kind of ignorance of meaning or content on the side of the hearer. To motivate the view that such ignorance is quite common, it is useful to consider it in the context of Stalnaker’s general approach to linguistic meaning and mental content, which is externalist. The accounts of Kaplan, Evans, and Davies & Humberstone (and even more so the accounts discussed in the following section) can be interpreted as attempts to at least partially retain an internalist type of meaning or content (in the form of a 1-intension or a 2D-intension). Stalnaker rejects this interpretation of 2D semantics (2004). In his view, there is no viable internalist component of meaning or content, inter alia because he believes that one needs to appeal to features of the external world to obtain determinate content. From a purely externalist perspective, it is to be expected that even competent speakers often lack knowledge of the meaning of the expressions they use. It therefore makes sense to characterize the information they gain from utterances that express necessary propositions as (partly) metalinguistic, that is, as information about the meanings of certain expressions. But of course, this externalist viewpoint is contested. In the next section, we will consider a very different account of meaning and content, and accordingly, a very different interpretation of 2D semantics.

b. Epistemic 2D Semantics (Chalmers, Jackson)

Epistemic 2D semantics is a particularly ambitious, and also particularly controversial, theory that relies on an epistemic understanding of 1-intensions. These 1-intensions are supposed to capture the role of linguistic expressions for a subject’s reasoning and in a subject’s cognition more generally, and thus serve as the basis for a general internalist semantics. Epistemic 2D semantics also provides general explanations for the occurrence of contingent a priori truths and necessary a posteriori truths, in a way that promises to retain systematic a priori access to modality. The two main proponents of epistemic 2D semantics are David Chalmers and Frank Jackson, who have defended the account in a great number of writings (for example, Chalmers 2004; Jackson 2004).

Epistemic 2D semantics is based on the idea that there are two ways of considering a possible world: One can consider it as actual or as counterfactual. Putnam’s Twin Earth scenario serves to illustrate this distinction. Assume first that the scenario does not represent the state of our world and the planet we live on. In the actual world, the odorless, drinkable substance in our rivers and lakes is H2O, and hence the Twin Earth scenario represents a way the world is not, but could have been. Considering the Twin Earth world as counterfactual in this way lends plausibility to the view that the substance on Twin Earth is not water, because its molecular structure differs from that of the substance in our rivers and lakes. However, one can also consider Putnam’s scenario in a different way. To do this, suppose that the scenario describes the actual world. That is to say, suppose that what you have been told about the molecular structure of the odorless, drinkable substance in our rivers and lakes is wrong. The stuff that we drink every day, that comes out of our faucets, that we call ‘water’, and so forth, is really XYZ. It seems natural to say that under this assumption, one should conclude that water is XYZ. As Chalmers often puts it: If it turns out that the watery stuff (that is, the odorless, drinkable substance in our rivers and lakes) is XYZ, then water is XYZ. In epistemic 2D semantics, these two ways of considering possible worlds are used to define two intensions:

A primary intension is a function from possible worlds considered as actual to extensions.

A secondary intension is a function from possible worlds considered as counterfactual to extensions.

The distinctive claim of epistemic 2D semantics is that every linguistic expression that is eligible for having an extension has both a primary intension and a secondary intension. Note that one can also define the epistemic version of a 2D-intension, as follows:

An epistemic 2D-intension is a function from pairs of possible worlds considered as actual and possible worlds considered as counterfactual to extensions.
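These three definitions can be made concrete for the term ‘water’ in the Twin Earth case. The following is a minimal sketch under strong simplifying assumptions: there are only two (uncentered) worlds, the extension of ‘water’ is represented by the name of a chemical kind, and the function names are hypothetical.

```python
# Hypothetical toy worlds: which chemical kind plays the 'watery stuff' role in each.
WATERY_STUFF = {"earth": "H2O", "twin_earth": "XYZ"}

def two_d_intension(actual, counterfactual):
    """Extension of 'water' at `counterfactual`, given that `actual` is considered as actual."""
    # Assumption: 'water' rigidly picks out the kind that is the watery stuff in
    # the world considered as actual, so `counterfactual` plays no role here.
    return WATERY_STUFF[actual]

def primary_intension(world):
    """The world is considered as actual."""
    return two_d_intension(world, world)

def secondary_intension(world, actual="earth"):
    """The world is considered as counterfactual, with `actual` held fixed."""
    return two_d_intension(actual, world)

print(primary_intension("twin_earth"))    # XYZ
print(secondary_intension("twin_earth"))  # H2O
```

The two intensions agree on the world that is in fact actual and differ on Twin Earth, which is the contrast the Twin Earth scenario is meant to bring out.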

Secondary intensions are closely related to the standard notion of modality, that is, to what Evans called ‘superficial modality’ and what, following Kripke (1980), is today usually called ‘metaphysical modality’: A sentence is metaphysically necessary if and only if it has a necessary secondary intension, that is, if and only if its secondary intension outputs True with respect to every world. Primary intensions, on the other hand, are closely connected to apriority: One of the key theses of epistemic 2D semantics is that a sentence is a priori if and only if it has a necessary primary intension. The worlds involved in primary intensions thus represent epistemic possibilities, that is, ways the world could be for all one can know a priori. This thesis is based on the idea that primary intensions are a priori accessible, which can be motivated as follows. To consider a possible world as actual, one may need to bracket one’s empirical knowledge, such as one’s knowledge that the substance in our rivers and lakes is H2O. But, most importantly for epistemic 2D semantics, one does not need empirical knowledge to determine the extensions of one’s expressions with respect to worlds considered as actual. This is because any lack of empirical knowledge is ignorance of features of the actual world, and such ignorance is irrelevant if one assumes that the world one is considering is the actual world: In considering a possible world as actual, only information that is hypothetically assumed is brought to bear. The question of how the information about these possible worlds is presented to a subject, such that it is sufficient to determine the extensions of the expressions she uses, is discussed in more detail below.

Epistemic 2D semantics assigns intensions to linguistic tokens. One obvious reason for this is that, as with other kinds of 2-intensions, secondary intensions can vary between different linguistic tokens of the same type, for instance, when indexical expressions are involved. A less obvious reason is that, as we will see, primary intensions can also vary between tokens of the same type. The worlds involved in primary intensions are centered worlds. Again, this can be motivated by their usefulness in dealing with indexical expressions, in the way described in §§ 1.c and 2.a.

In what follows, the most important philosophical implications of epistemic 2D semantics will be discussed. Epistemic 2D semantics has been used to defend modal rationalism, that is, the view that we have a priori access to what is possible or necessary (compare § 3.b.i.). Another important claim made by proponents of epistemic 2D semantics is the thesis of scrutability, according to which all truths can be derived a priori from a narrowly constrained description of the world (compare § 3.b.ii.). Based on these epistemic theses, proponents of epistemic 2D semantics have argued that philosophical practice involves (or even has to involve) a central a priori element (compare § 3.b.iii.).

i. Modal Rationalism

According to epistemic 2D semantics, all expressions that are eligible for having an extension have both a primary and a secondary intension. Given the connections between primary intensions and apriority on the one hand, and between secondary intensions and metaphysical modality on the other, this implies that the 2D structures of all contingent a priori truths and of all necessary a posteriori truths can be described as follows:

(Contingent a priori) A sentence S is contingent a priori if and only if S has a necessary primary intension and a contingent secondary intension.

(Necessary a posteriori) A sentence S is necessary a posteriori if and only if S has a contingent primary intension and a necessary secondary intension.
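The two biconditionals can be read as a classification procedure over a sentence’s 2D-intension. The sketch below is illustrative only: it ignores centering, uses two hypothetical worlds, and models a sentence as a function from pairs (world considered as actual, world considered as counterfactual) to truth-values. The treatment of ‘water’ and ‘Julius’ as actuality-dependent follows the earlier discussion; the world descriptions and function names are assumptions.

```python
# Hypothetical toy worlds; "w@" stands in for the actual world.
WORLDS = {
    "w@": {"watery_stuff": "H2O", "zipper_inventor": "Judson"},
    "w1": {"watery_stuff": "XYZ", "zipper_inventor": "someone else"},
}

def water_is_h2o(actual, w):
    # Assumption: 'water' rigidly refers to the watery stuff of the world considered as actual.
    return WORLDS[actual]["watery_stuff"] == "H2O"

def julius_invented_the_zipper(actual, w):
    # Assumption: 'Julius' rigidly refers to the zipper's inventor in the world considered as actual.
    return WORLDS[actual]["zipper_inventor"] == WORLDS[w]["zipper_inventor"]

def classify(sentence, actual="w@"):
    """Classify a sentence by its primary (diagonal) and secondary (horizontal) intension."""
    primary_necessary = all(sentence(w, w) for w in WORLDS)
    secondary_necessary = all(sentence(actual, w) for w in WORLDS)
    modal = "necessary" if secondary_necessary else "contingent"
    epistemic = "a priori" if primary_necessary else "a posteriori"
    return f"{modal} {epistemic}"

print(classify(water_is_h2o))                # necessary a posteriori
print(classify(julius_invented_the_zipper))  # contingent a priori
```

In both cases the divergence between the two intensions arises because the extension of a term is fixed by the world considered as actual.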

On the standard construal, the worlds involved in primary and secondary intensions are the same, the only difference being that the worlds involved in primary intensions are centered. Any kind of divergence between epistemic and metaphysical modality can thus occur only when expressions are involved whose primary and secondary intensions yield different extensions with respect to some worlds, and this difference in extensions, in turn, must be due to the fact that it makes a difference whether the world in question is considered as actual or as counterfactual. The example of ‘water’, discussed above, suggests that this can indeed make a difference: If one considers the Twin Earth scenario as actual, the substance in its rivers and lakes falls under the extension of ‘water’, but if one considers it as counterfactual, it does not. But one might still wonder why an expression’s extension with respect to some possible world should depend on the way one considers this world. To explain this, it is helpful to reconsider ‘actually’-involving expressions, such as the following:

(4) The actual inventor of the zipper is male.

Let our world, in which Whitcomb Judson invented the zipper, be w@, and let w1 be a world in which his wife Annie invented the zipper. If one considers w1 as actual, then (4) is false with respect to it, because the actual inventor of the zipper is then Annie. If, however, one assumes that w@ is the actual world and thus considers w1 as a counterfactual world, then (4) is true with respect to w1, because the occurrence of the term ‘actual’ makes the truth-value of (4) depend on features of the actual world, that is, w@. Inter alia, its truth-value depends on features of Whitcomb Judson, since he invented the zipper in w@. Davies & Humberstone already suggested that the occurrence of expressions that either explicitly or implicitly refer to features of the actual world can give rise to both contingent a priori truths and necessary a posteriori truths. Epistemic 2D semantics generalizes this idea. Accordingly, all cases in which primary and secondary intensions diverge involve expressions that depend in some way on the actual world.

Since primary intensions involve the same worlds as secondary intensions, every epistemic possibility corresponds to a metaphysical possibility. Chalmers puts this as follows:

Metaphysical Plenitude: For all S, if S is epistemically possible, there is a centered metaphysically possible world that verifies S. (Chalmers 2006, 82)

(Roughly speaking, that a world w verifies an epistemic possibility S means that w makes S true provided that w is considered as actual.) Metaphysical plenitude may still seem like a surprising claim, since the existence of a posteriori necessities implies that there are epistemic possibilities that are metaphysically impossible. So how can a metaphysical impossibility, such as ‘water = XYZ’, be verified by a metaphysically possible world? The world that verifies ‘water = XYZ’ can be described as follows: It is a world in which the odorless, drinkable substance in rivers and lakes, that is, the watery stuff, is XYZ. Such a world is clearly metaphysically possible. However, if one describes it as a world in which water = XYZ, this misleadingly suggests that the world described is one in which the substance that is the watery stuff in the actual world (that is, H2O) is XYZ, due to the actuality-dependence of the term ‘water’. The latter scenario is indeed metaphysically impossible. To avoid any confusion, one should thus describe the possibility in question by using expressions that do not involve any actuality-dependence, that is, expressions whose primary and secondary intensions cannot come apart.

To sum up, epistemic 2D semantics provides the following account of our epistemic access to metaphysical modality. Both contingent a priori truths and necessary a posteriori truths are explained by the occurrence of expressions whose extension with respect to some worlds depends on whether these worlds are considered as actual or as counterfactual. Such expressions must involve some kind of explicit or implicit reference to the actual world, such that their 2-intensions vary depending on the actual world’s characteristics. This explanation of necessary a posteriori truths allows that whenever some hypothesis cannot be ruled out a priori, that is, whenever it is epistemically possible, there is a metaphysical possibility that corresponds to it. To correctly identify this metaphysical possibility, one just needs to make sure to describe it by using only expressions whose primary and secondary intensions cannot come apart. Because it postulates that we have a priori access to metaphysical modality, this account is often called ‘modal rationalism’.

Since philosophical inquiry is traditionally thought to proceed a priori, and since philosophy very often deals with sentences that are necessarily true (or necessarily false), it would be of great philosophical importance if modal rationalism could be established. However, the epistemic 2D account of modal knowledge is highly controversial. Objections to the account can be divided into two categories. Objections of the first category state that epistemic 2D semantics does not successfully explain the standard examples of a posteriori necessities, such as ‘Water = H2O’ and ‘Hesperus = Phosphorus’—or at least not all of them. It seems hard to deny that if epistemic 2D semantics accurately captures the semantics of the expressions involved, then its explanation of a posteriori necessities is compelling. The crucial question is thus whether the relevant semantic account is indeed accurate. The most pressing issue here is whether all the relevant expressions—including, for instance, proper names—really have primary intensions. This issue is discussed in the next section.

According to objections of the second category, there are other kinds of necessary truths that are different from the cases that have usually been discussed and that cannot be explained in the same way. One example sentence that has been brought up in this context is ‘God exists’. It has been argued that God exists necessarily. However, it is plausibly not a priori that God exists, and it is also plausible that the sentence does not exhibit any actuality-dependence. If all of this is correct, then ‘God does not exist’ is epistemically possible, but it is not verified by any metaphysically possible world. The example is of course highly controversial—the vast majority of philosophers do not believe that there is a necessarily existing God. But it illustrates what general form a counterexample of the type at issue would need to have. Modal rationalists have argued that there are general reasons to deny that such epistemic possibilities that do not correspond to metaphysical possibilities exist (for example, Chalmers 2009).

ii. The Scrutability of Truth

Primary and secondary intensions are defined in terms of how one considers a possible world—as actual or as counterfactual. But it is not obvious what it means to consider a possible world. Since one cannot perceive merely possible worlds, it is natural to assume that they are given to us via a description. Chalmers has explained in detail what such descriptions, which he calls “canonical descriptions”, should involve (for example, Chalmers 2006, 86–93). To begin with, a canonical description has to be complete, in the sense that it must not leave out any facts that might be relevant to the extension of some expression. However, completeness should not be achieved at the cost of triviality. Suppose, for instance, that one wonders whether a sentence S is true with respect to world w considered as actual. If S is part of the canonical description of w, then it is trivial that one can derive the extension of S a priori. According to Chalmers, the canonical description of a world should thus involve only a limited vocabulary. Furthermore, there are constraints on the kinds of expressions that may be used in a canonical description. For instance, ‘There is water in rivers and lakes’ should come out as true with respect to a world in which the only watery stuff is XYZ if that world is considered as actual, and it should come out as false with respect to such a world if that world is considered as counterfactual. However, if the canonical description of this world involves the word ‘water’, then this difference between the sentence’s primary and its secondary intension cannot be maintained, on pain of contradiction (compare Chalmers 2006, 86). Therefore, the limited vocabulary in which canonical descriptions are phrased should involve only expressions that are semantically neutral, that is, expressions whose primary and secondary intensions cannot come apart.

According to epistemic 2D semantics, primary intensions—that is, the extensions of our expressions with respect to worlds considered as actual—are a priori. Given this, the understanding of what it is to consider a world as actual in terms of canonical descriptions just explained leads to another central thesis of epistemic 2D semantics, the scrutability of truth. Here is one formulation of this thesis that concerns the actual world. (Notice that with respect to the actual world, primary and secondary intensions always yield the same extensions.)

(Scrutability of Truth) There is a description of the world in a limited and semantically neutral vocabulary from which every truth can be derived a priori.

Chalmers & Jackson (2001) make a specific proposal as to what such a description could look like. They argue that all truths follow a priori from a description they call ‘PQTI’ (for Physics, Qualia, That’s all, and Indexicals). P stands for a complete microphysical description of the world, in the language of a completed future physics. Q is a complete description of the phenomenal states of all subjects, that is, of their subjective experiences. The I component adds indexical information, which picks out a subject and a time, in order to determine the truth-values of sentences such as ‘I am a chess player’ or ‘Today is Tuesday’. Finally, T is a totality clause, which states that this is all there is in the world. This clause is necessary to rule out things not entailed by PQI. Suppose, for instance, that there are no ghosts, and thus, ‘There are no ghosts’ is true. PQI does not entail the truth of this sentence, because it does not state that PQI provides a complete description of the world. The inclusion of T adds this information. Notice that, plausibly, ghosts could have existed. This illustrates that microphysical, phenomenal, and indexical information plus a totality clause is insufficient to derive canonical descriptions of many other possible worlds. To describe these, one may thus need to expand one’s vocabulary.

The scrutability thesis brings out an important commitment of epistemic 2D semantics. According to this account, all of our expressions have a priori associations that are extensive enough to determine the expressions’ extensions. In fact, epistemic 2D semantics even entails that these a priori associations determine the extensions of our expressions with respect to every world considered as actual. However, there are highly influential arguments, originating in Kripke (1980), that seem to show that some kinds of expressions—most notably proper names—have no such a priori associations. The epistemic arguments suggest that everything a speaker associates with a name could turn out to be false in the light of additional empirical information—one might thus call these arguments ‘arguments from empirical defeasibility’. For instance, even our most central beliefs about Kurt Gödel—for instance, that he discovered the incompleteness of arithmetic—could be empirically defeated if, say, we got compelling evidence that the incompleteness proofs were developed by a man named ‘Schmidt’ and then later stolen and published by Gödel (Kripke 1980, 83f.). Thus, not even ‘discoverer of the incompleteness of arithmetic’ is a priori associated with the name ‘Kurt Gödel’, and it is hard to see what else could be. The epistemic arguments are compatible with the view that speaker associations determine the extension of our expressions—after all, Gödel is the unique discoverer of the incompleteness of arithmetic, even though it is not a priori that he is. However, the semantic arguments, also called ‘arguments from ignorance and error’, suggest that speaker associations need not suffice to determine the (correct) extension of an expression. Speaker associations might fail to do so in a given case either because they are insufficiently specific (in a case of ignorance), or because the speaker associations attribute features the referent does not have (in a case of error). 
The following examples due to Kripke exemplify these cases. First, a case of ignorance: Many people know that Feynman and Gell-Mann are physicists. But they know nothing to distinguish Gell-Mann from Feynman. Nevertheless, when these people say ‘Feynman’, they refer to Feynman, and when they say ‘Gell-Mann’, they refer to Gell-Mann. Second, a case of error: The only thing many people believe about Einstein is that he invented the atomic bomb. Nevertheless, when these people use the name ‘Einstein’, they refer not to Oppenheimer or Szilard, but to Einstein (Kripke 1980, 81). The example illustrates that the semantic arguments go one step further than the epistemic arguments: While cases like that of Gödel and Schmidt aim only to show that what speakers associate with a name could be erroneous, for all the speaker knows a priori, the case of Einstein is supposed to show that these associations are sometimes erroneous.

Similar kinds of arguments have been given concerning other kinds of expressions, such as natural kind terms (for example, Putnam 1975, 226). In order to defuse them, proponents of epistemic 2D semantics have to make a case that there are nevertheless speaker associations that are sufficient to determine an expression’s extension (including the extension with respect to other possible worlds considered as actual), and that these associations are also a priori. To do this, they have pointed out that the associations that determine primary intensions need not correspond to what first comes to a speaker’s mind. For example, in the case of a proper name such as ‘Gödel’, it is indeed plausible that ‘discoverer of the incompleteness of arithmetic’ is not part of the primary intension ordinary speakers associate with the name. Both Chalmers (2002, 617; 2003, 62–64; 2006, 91) and Jackson (1998b, 209–212; 2004, 270–271) suggest that speaker associations often concern other people’s usage of an expression. Accordingly, the primary intension of ‘Gödel’ could be approximately equivalent to the definite description ‘the individual called ‘Gödel’ by those from whom I acquired the name’. Other expressions can be treated in the same kind of way. If a speaker knows, for instance, that Gödel is called ‘Gödel’ by those from whom they acquired the name, then this proposal suffices to repudiate the arguments from ignorance and error. If the associations at issue are also a priori, then the proposal suffices to repudiate the arguments from empirical defeasibility. The claim that such associations are a priori is especially controversial, however, and will require further investigation.

Even if one accepts that linguistic expressions have primary intensions, and thus speaker associations that determine their extensions with respect to the actual world and with respect to other worlds considered as actual, one may still have doubts about the scrutability of truth. For instance, it seems extremely bold to claim that microphysical, qualitative, and indexical information, in conjunction with a ‘that’s all’ clause, is sufficient to derive a priori that Anand is a chess player—in any case, it is far beyond anyone’s cognitive capacities to perform such a derivation. Chalmers and Jackson try to explain in broad outline how certain kinds of truths, such as ordinary macroscopic truths like ‘Water covers most of the Earth’, are in principle a priori derivable from PQTI (Chalmers & Jackson 2001; Chalmers 2012). Another way of arguing for the scrutability thesis is by appealing to modal rationalism. It is plausible that the information in PQTI metaphysically determines all truths, in the sense that there is no other metaphysically possible world described by PQTI with respect to which some sentence has a different truth-value. More generally, the majority of philosophers believe that all facts are metaphysically determined by the facts in a small number of domains—most prominently, physical facts (according to physicalists), combined with some irreducible mental facts (according to dualists), and possibly some few other kinds of facts (perhaps including normative facts). Given this, if one adds to these facts indexical information and a ‘that’s all’ clause—for reasons explained above—a complete description stating these facts will metaphysically determine all truths. Now assume that some world w is considered as actual and described in terms of a semantically neutral vocabulary. 
If the explanation of necessary a posteriori truths provided by epistemic 2D semantics is correct and complete, then everything that is metaphysically determined by the features given the description of w is also epistemically determined, that is, a priori entailed. If one adds to this the assumption that our world has relatively few fundamental ingredients, and that one thus needs only a limited vocabulary to describe it, the scrutability of truth follows: All truths are a priori entailed by a complete description of the world in a limited and semantically neutral vocabulary.

iii. Philosophical Methodology

Epistemic 2D semantics has sparked many debates that revolve around the question whether there is an essential a priori component to philosophical inquiry. One instance of this general issue that has received particularly close attention concerns the issue of physicalism in the debate about the mind-body problem. Physicalism involves the claim that all facts about our world are metaphysically determined by the collection of physical facts. Both Chalmers (2009) and Jackson (2005) have argued that the truth of physicalism would entail that mental facts follow a priori from physical facts. Their claim is motivated by modal rationalism, which entails that metaphysical determination amounts to epistemic determination, given that the world is considered as actual and described in semantically neutral vocabulary. But many have argued that phenomenal facts are not a priori entailed by physical facts. Hence, modal rationalism seems to raise a problem for physicalism. (Notice that while Chalmers uses the epistemic 2D account to argue against physicalism, Jackson endorses physicalism and hence believes that there is an a priori entailment between physical and phenomenal facts.) Many physicalists have resisted the idea that they are committed to the existence of an a priori entailment between the physical and the mental. In making their case, they have either appealed to some special features of phenomenal concepts, or they have resisted modal rationalism on more general terms. At this point, no consensus regarding this issue is in sight.

Proponents of conceptual analysis hold that we can gain philosophical knowledge in virtue of our grasp of philosophical concepts and our understanding of philosophical expressions. Semantic externalism presents a challenge for this view, since externalism seems to imply that linguistic understanding and concept possession need not involve any significant knowledge about meaning. Now, one of the key claims of epistemic 2D semantics is that, against standard externalist views, our expressions come with a priori associations that determine the expressions’ extensions—that is to say, expressions have primary intensions. It is natural to suggest that these a priori associations can underpin the method of conceptual analysis. And indeed, Chalmers and Jackson have used insights from 2D semantics to defend this traditional philosophical method (Jackson 1998a; Chalmers & Jackson 2001). As they point out, conceptual analysis often involves thought experiments. A famous example is Edmund Gettier’s contribution to the analysis of knowledge. To undermine the claim that knowledge is justified true belief, Gettier (1963) describes two hypothetical cases in which a subject has a justified true belief, but no knowledge. If the central claims of epistemic 2D semantics are tenable, they can serve as a theoretical underpinning for this practice of doing conceptual analysis via thought experiments. If modal rationalism is correct, any hypothetical scenario that is epistemically possible corresponds to a metaphysical possibility. Furthermore, if the scrutability thesis is correct, this can explain our ability to determine the extension of expressions, such as ‘knowledge’, with respect to a hypothetical scenario.

For conceptual analysis to be fruitful as a philosophical method, the speaker associations that constitute primary intensions need to be shared with regard to at least some of those expressions that are subject to philosophical scrutiny. Otherwise, a philosopher engaged in conceptual analysis may acquire knowledge, but this knowledge will be based only on the primary intensions that this particular individual associates with the relevant expressions, and thus not be readily shareable with the philosophical community. Jackson (1998b; 2004) has argued that primary intensions are indeed usually shared among competent speakers in a linguistic community, along the following lines. Language is primarily used to convey information to others, that is, to communicate. This is possible only if linguistic expressions can be used to represent the state of the world, which requires there to be associations between words and properties. Furthermore, for speakers to be able to make use of this information, these associations between words and properties must be known to them. Jackson holds that the known associations between words and properties are primary intensions. On this view, it would be of little use if different speakers within a linguistic community ‘knew’ different associations between words and properties. Hence, primary intensions must be shared.

It seems plausible that it would be a serious hindrance for communication if the associations between words and properties differed greatly between subjects. But it is doubtful that successful communication requires perfect alignment of these associations. And indeed, Jackson concedes that often, communication can succeed if a speaker’s and a hearer’s primary intensions are sufficiently close to identical (Jackson 1998b, 214f.). However, this concession raises a worry regarding conceptual analysis. The cases discussed in conceptual analysis are often highly contrived and not very relevant to ordinary communication. Their evaluation is, however, often crucial for the evaluation of the underlying philosophical issues. Jackson’s argument thus leaves it open that there are often, or even always, divergences in the primary intensions associated by different speakers that, while irrelevant to ordinary communication, become apparent in the evaluation of philosophical thought experiments. During the early 21st century, experimental philosophers have collected a wealth of empirical data about subjects’ intuitions concerning philosophically relevant hypothetical cases. The results of these wide-ranging studies are varied and not easily summarized. In some cases, they have confirmed the philosophical consensus; in others, they have not.

4. References and Further Reading

a. Primary Sources

  • Chalmers, David. 2002. The Components of Content. In D. Chalmers (ed.), Philosophy of Mind: Classical and Contemporary Readings. Oxford: Oxford University Press, 608–633.
    • Motivates a 2D account of mental content.
  • Chalmers, David. 2003. The Nature of Narrow Content. Philosophical Issues 13(1), 46–66.
    • Argues that primary intensions can serve as a type of mental content that is determined by a subject’s intrinsic state.
  • Chalmers, David. 2004. Epistemic Two-Dimensional Semantics. Philosophical Studies 118(1–2), 153–226.
    • An extensive elaboration and defense of epistemic 2D semantics.
  • Chalmers, David. 2006. The Foundations of Two-Dimensional Semantics. In M. Garcia-Carpintero & J. Macia (eds.), Two-Dimensional Semantics: Foundations and Applications. Oxford: Oxford University Press, 55–140.
    • A survey of 2D theories, but with a focus on whether they yield a connection between 1-intensions and apriority.
  • Chalmers, David. 2009. The Two-Dimensional Argument Against Materialism. In B. McLaughlin & S. Walter (eds.), The Oxford Handbook of Philosophy of Mind. Oxford: Oxford University Press.
    • Articulates the 2D argument against physicalism in detail and gives an overview of the debate.
  • Chalmers, David. 2012. Constructing the World. Oxford: Oxford University Press.
    • A defense of various scrutability theses.
  • Chalmers, David & Frank Jackson. 2001. Conceptual Analysis and Reductive Explanation. The Philosophical Review 110(3), 315–361.
    • Argues that all truths are a priori implied by PQTI, and that physicalists should hold that phenomenal truths are a priori implied by physical truths.
  • Davies, Martin & Lloyd Humberstone. 1981. Two Notions of Necessity. Philosophical Studies 58, 1–30.
    • Proposes to capture Evans’s distinction between deep and superficial contingency by means of the sentential operators ‘□’, ‘fixedly’, and ‘actually’.
  • Evans, Gareth. 1979. Reference and Contingency. The Monist 62, 161–189.
    • Explains the occurrence of contingent a priori truths on the basis of the distinction between deep and superficial contingency.
  • Frege, Gottlob. 1892/1952. Über Sinn und Bedeutung. Translated in P. Geach & M. Black (eds.), Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1952.
    • Introduces the distinction between sense and reference, the former of which yields a type of content that is intimately connected to epistemic notions.
  • Gettier, Edmund. 1963. Is Justified True Belief Knowledge? Analysis 23(6), 121–123.
    • Presents two counterexamples against the traditional view that knowledge is justified true belief.
  • Jackson, Frank. 1998a. From Metaphysics to Ethics: A Defence of Conceptual Analysis. Oxford: Clarendon Press.
    • Uses 2D semantics to argue that conceptual analysis plays a vital role in answering philosophical questions.
  • Jackson, Frank. 1998b. Reference and Description Revisited. Philosophical Perspectives 12, Language, Mind, and Ontology, 201–218.
    • A defense of the view that speaker associations determine reference and meaning.
  • Jackson, Frank. 2004. Why We Need A-Intensions. Philosophical Studies 118(1–2), 257–277.
    • Argues that primary intensions capture the representational content of what is communicated from speaker to hearer.
  • Jackson, Frank. 2005. The Case for A Priori Physicalism. In C. Nimtz & A. Beckermann (eds.), Philosophy–Science–Scientific Philosophy. Main Lectures and Colloquia of Gap 5, Fifth International Congress of the Society for Analytical Philosophy. Paderborn: Mentis.
    • Argues that physicalists should hold that mental facts follow a priori from physical facts.
  • Kamp, Hans. 1971. Formal Properties of ‘Now’. Theoria 37(3), 227–274.
    • An early 2D account of the indexical ‘now’.
  • Kaplan, David. 1989. Demonstratives. In J. Almog, J. Perry & H. Wettstein (eds.), Themes from Kaplan. Oxford: Oxford University Press, 481–563.
    • Develops an account of indexicals on the basis of the distinction between character and content.
  • Kripke, Saul. 1980. Naming and Necessity. Cambridge, MA: Harvard University Press.
    • Argues that proper names, indexicals, and natural kind terms are rigid designators and spells out the consequences for the epistemology of modality.
  • Lewis, David. 1979. Attitudes De Dicto and De Se. Philosophical Review 88(4), 513–543.
    • Introduces a theory of mental content that can account for the phenomenon of egocentric belief.
  • Perry, John. 1977. Frege on Demonstratives. Philosophical Review 86(4), 474–497.
    • Argues that demonstrative expressions present a problem for Frege’s philosophy of language, and suggests an alternative account.
  • Putnam, Hilary. 1975. The Meaning of ‘Meaning’. Minnesota Studies in the Philosophy of Science 7, 131–193.
    • Argues for an externalist account of meaning.
  • Stalnaker, Robert. 1978. Assertion. In P. Cole (ed.), Syntax and Semantics 9: Pragmatics. New York, NY: Academic Press, 315–332.
    • Develops an account of communication that introduces 1-intensions as a way of reinterpreting certain kinds of utterances.
  • Stalnaker, Robert. 1987. Semantics for Belief. Philosophical Topics 15, 177–190.
    • Defends a possible world account of mental content, drawing on 2D semantics.
  • Stalnaker, Robert. 2001. On Considering a Possible World as Actual. Aristotelian Society Supplementary Volume 75, 141–156.
    • Defends a metasemantic account of 2D semantics and argues that this account suggests skepticism about apriority.
  • Stalnaker, Robert. 2004. Assertion Revisited: On the Interpretation of Two-Dimensional Modal Semantics. Philosophical Studies 118(1–2), 299–322.
    • Suggests that epistemic 2D semantics is based on an internalist view of intentionality, in contrast with metasemantic 2D semantics, and argues that internalism is untenable.

b. Secondary Sources

  • Chalmers, David. 2002b. Does Conceivability Entail Possibility? In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press, 145–200.
    • Offers a comprehensive statement and defense of modal rationalism.
  • Chalmers, David. 2002c. On Sense and Intension. Philosophical Perspectives 16, 135–182.
    • Argues that epistemic 2D semantics represents a viable account of meaning in the tradition of Frege.
  • Garcia-Carpintero, Manuel & Josep Macia (eds.). 2006. Two-Dimensional Semantics: Foundations and Applications. Oxford: Oxford University Press.
    • A collection of articles on 2D semantics.
  • Jackson, Frank. 2010. Language, Names, and Information. Oxford: Wiley-Blackwell.
    • Argues that the role of proper names in transmitting information is best explained by an account on which their meaning is given by speaker associations.
  • Kipper, Jens. 2012. A Two-Dimensionalist Guide to Conceptual Analysis. Frankfurt a.M.: Ontos.
    • Defends epistemic 2D semantics and discusses the role that conceptual analysis can play on the basis of this account.
  • Nimtz, Christian. 2017. Two-Dimensional Semantics. In B. Hale, C. Wright & A. Miller (eds.), The Blackwell Companion to the Philosophy of Language, 2nd Edition. Oxford: Blackwell, 948–969.
    • A survey of 2D semantic theories, with an emphasis on how they relate to Kripke’s semantic and metasemantic views.
  • Schroeter, Laura. 2017. Two-Dimensional Semantics. In The Stanford Encyclopedia of Philosophy (Summer 2017 Edition), Edward N. Zalta (ed.).
    • A survey of 2D semantic theories that includes an extensive elucidation of Chalmers’s 2D argument against physicalism.

 

Author Information

Jens Kipper
Email: jkipper@ur.rochester.edu
University of Rochester
U. S. A.

Health Care Ethics

Health care ethics is the field of applied ethics that is concerned with the vast array of moral decision-making situations that arise in the practice of medicine, as well as with the procedures and the policies that are designed to guide such practice. Of all of the aspects of the human body, and of a human life, which are essential to one’s well-being, none is more important than one’s health. Advancements in medical knowledge and in medical technologies bring with them new and important moral issues. These issues often come about as a result of advancements in reproductive and genetic knowledge as well as innovations in reproductive and genetic technologies. Other areas of moral concern include the clinical relationship between the health care professional and the patient; biomedical and behavioral human subject research; the harvesting and transplantation of human organs; euthanasia; abortion; and the allocation of health care services. Essential to the comprehension of moral issues that arise in the context of the provision of health care is an understanding of the most important ethical principles and methods of moral decision-making that are applicable to such moral issues and that serve to guide our moral decision-making. To the degree to which moral issues concerning health care can be clarified, and thereby better understood, the quality of health care, as both practiced and received, should be enhanced.

Table of Contents

  1. A Brief History of Health Care Ethics
  2. Methods of Moral Decision-Making
    1. Virtue Ethics: Aristotle
    2. Utilitarian Theories: Mill
    3. Deontological Theories: Kant
    4. Principlism
    5. Casuistry
    6. Feminist Ethics
    7. The Ethics of Care
  3. Ethical Principles
    1. Autonomy
    2. Beneficence
    3. Nonmaleficence
    4. Justice
  4. Ethical Issues
    1. The Health Care Professional-Patient Relationship
      1. Truth-Telling
      2. Informed Consent
      3. Confidentiality
    2. The Question of a Right to Life
      1. Human Life: Abortion
      2. Human Death: Euthanasia and Physician-Assisted Suicide
    3. Human Subject Research
      1. The Rights of Subjects
      2. Vulnerable Populations
    4. Reproductive and Genetic Technologies
      1. Reproductive Opportunities for Choice
      2. Genetic Opportunities for Choice
    5. The Allocation of Health Care Resources
      1. Organ Procurement and Transplantation
      2. The Question of Eligibility in Health Care
    6. Health Care Organization Ethics Committees
  5. Conclusion
  6. References and Further Reading

1. A Brief History of Health Care Ethics

While the term “medical care” designates the intention to identify and to understand disease states in order to be able to diagnose and treat patients who might suffer from them, the term “health care” has a broader application to include not only what is entailed by medical care but also considerations that, while not medical, nevertheless exercise a decided effect on the health status of people. Thus, not only are bacteria and viruses (which are in the purview of medicine) of concern in the practice of health care, so too are cultural, societal, economic, educational, and legislative factors to the extent to which they have an impact, positive or negative, on the health status of any of the members of one’s society. For this reason, health care workers include not only professional clinicians (for example, physicians, nurses, medical technicians, and many others) but also social workers, members of the clergy, medical facility volunteers, to name just a few, and, in an extended sense, even employers, educators, legislators, and others.

For a person to be considered healthy, in the strictest sense of the term, is for that person to exhibit a state of well-being free of any effects of disease, illness, or injury on the person’s physiological, psychological, mental, or emotional existence. It is fair to say that no one could ever achieve this level of “complete health.” Consequently, the health status of any given person, at any given time, is best understood in terms of the degree to which it approximates this ideal standard of health.

In the preamble to the Constitution of the World Health Organization, “health” is defined as: “…a state of complete physical, mental and social well-being[,] and not merely the absence of disease or infirmity.” This definition of “health” can also be said to embrace an ideal, but it does so by representing health as a positive, rather than as a negative, concept.

Additional distinctions concerning definitions of “health” include that between what is sometimes referred to as a natural, or biological, view of health (and of disease) and a socially constructed view. The former view entails that health, for all natural organisms (including human beings considered biologically), is to be correlated with the degree to which the natural functions of the organism comport with its natural evolutionary design. On this interpretation, disease is to be correlated with any malfunctions, that is, any deviations of the organism’s natural functions from what would be expected given its natural evolutionary design. The adoption of this view of health by health care practitioners results in identifiable standards, or ranges, of “normalcy” concerning health care diagnostics, such as blood pressure, cholesterol levels, and so forth, the upshot of which is that any deviation from these norms is sufficient to pronounce the patient “unhealthy,” if not “diseased.” By contrast, on the socially constructed view, health is determined by some social value(s) such that any deviation from the socially accepted norm, or average, for our species is considered to be a disease or a disability if the deviation is viewed as a disvalue, that is, as something to be avoided. An example is the question of whether homosexuality is to be seen as a disease state, specifically, as a mental disorder, as the American Psychiatric Association officially held it to be for much of the 20th century, until it reversed its position in 1980. Based on the association’s own explanations of each of these definitional decisions, it would appear that its former official position was value-based and that its later position was a correction (Tong, 2012).

Similar distinctions concerning the concept of health, and its resultant definition, include the representation of health as “normative,” as contrasted with a “normal biological functioning” representation. Anita Silvers argues that organizations that set public health policy by their very nature incorporate (even if unconsciously) any of a number of social dimensions of health in their official definitions of “health.” Of course, to do this has practical effects that typically serve the interests of the organization in question. A definition of “health” that uses a limited standard may be appropriate for some segments of the human population to which it is applied while failing to reflect other segments of that same population. Such a definition might render people in these latter segments “pathological,” literally, by definition, despite the fact that with a more objective definition of “health” they would be deemed members of the healthy population.

Moreover, some such organizations implement classification systems that allow for both biological and social considerations to measure health outcomes for the purpose of determining the effectiveness of health care programs when compared to each other. Such comparisons are then used to decide, for example, what type of disease prevention measure(s) to implement or which particular sub-populations get selected for curative measures. According to Silvers, whatever the consensus in any particular society is, concerning what the word “health” designates, determines the health care services to be provided as well as the specific beneficiaries of such services. This conflation of normative and biological factors of consideration in the conceptualization and the ultimate definition of “health” by these organizations that set public health policy leads one to believe that such a definition is exclusively biological, that is, objective, and thereby to be accepted without question (Silvers, 2012).

Michael Boylan surveys a good number and variety of what he calls recent popular paradigms concerning the concept of health, as follows: 1) functional approaches to health, including “objectivism,” as associated with an “uncompromised lifespan,” and the “functionalism/dysfunctionalism” debate; 2) the public health approach to health; and 3) subjectivist approaches to health, which do not restrict themselves to physiological health but focus more broadly on human “well-being.” After demonstrating respects in which each of these approaches to our understanding of health fail, he proposes a “self-fulfillment approach” to human health. Central to this approach, and as a first-order metaethical theory, is the “personal worldview imperative,” which requires of each of us to develop a worldview that is both comprehensive and internally coherent but that is also good and one that we would strive to actualize in our daily lives. In other words, according to this imperative, such a worldview must 1) be comprehensive, 2) be internally coherent, 3) connect to a normative ethical theory, and 4) be, at a minimum, aspirational and acted upon. This personal worldview imperative is designed as an independent and objective means of assessment in order to avoid some of the inherent flaws of the well-being approach. In conjunction with what Boylan recommends as a “personal worldview of cooperation” (as a more holistic way of viewing the world), this personal worldview imperative would, arguably, constitute the most comprehensive and objective approach to our understanding of human health (Boylan, 2004 and Boylan, 2012).

Despite the fact that “health care” is a term that reflects the more recent phenomenon of the practice of health care as expanded beyond the practice of medical care, ethical concerns related to health care can be traced back to the beginnings of medical care. While this would take us back to primitive cultures at the time of the origin of human life as we know it, the first known evidence of ethical concerns in the practice of medicine in Western cultures is what has been handed down as the Corpus Hippocraticum, which is a compilation of writings by a number of authors, including a physician known as Hippocrates, over at least a few centuries, beginning in the 5th century, B.C.E., and which includes what has come to be known as the Oath of Hippocrates. According to these authors, medical care should be practiced in such a way as to diminish the severity of the suffering that illness and disease bring in their wake, and the physician should be acutely aware of the limitations concerning the practical art of medicine and refrain from any attempt to go beyond such limitations accordingly. The Oath of Hippocrates includes explicit prohibitions against both abortion and euthanasia but includes an equally explicit endorsement of an obligation of confidentiality concerning the personal information of the patient.

Additional codes of ethics concerning the practice of medicine have also come down to us: from the 1st century A.D., known as the Oath of Initiation, attributed to Caraka, an Indian physician; from (likely) the 6th century A.D., known as the Oath of Asaph, written by Asaph Judaeus, a Hebrew physician from Mesopotamia; from the 10th century A.D., known as Advice to a Physician, written by Haly Abbas (Ahwazi), a Persian physician; from the 12th century A.D., known as the “Prayer of Moses Maimonides,” Maimonides being a Jewish physician in Egypt; from the 17th century A.D., known as the Five Commandments and Ten Requirements, written by Chen Shih-kung, a Chinese physician; from the 18th century A.D., known as A Physician’s Ethical Duties, written by Mohamad Hosin Aghili, a Persian; and many more.

In 1803, Thomas Percival in England published his Medical Ethics: A Code of Institutes and Precepts, Adapted to the Professional Conduct of Physicians and Surgeons, which included professional duties on the part of physicians in private or general practice to one’s patients. The founding of the American Medical Association in 1847 was the occasion for the immediate formulation of standards for an education in medicine and for a code of ethics for practicing physicians. This Code of 1847 included not only “duties of physicians to their patients” but also “obligations of patients to their physicians,” and not only “duties of the profession to the public” but also “obligations of the public to physicians.” From the 19th century to well into the 20th century, societies or associations of medical doctors formulated and published their own codes of ethics for the practice of medicine.

A good number of medical codes of ethics were formulated and adopted by national and international medical associations during the middle part of the 20th century. In an effort to modernize the Oath of Hippocrates for practical application, in 1948 the World Medical Association adopted the Declaration of Geneva, followed the very next year by its adoption of the International Code of Medical Ethics. The former included, in addition to an enumeration of a physician’s moral obligations to one’s patients, an explicit commitment to the humanitarian goals of medicine. Since then, virtually every professional occupation that is health care-oriented in the U. S. has established at least one association for its membership and a code of professional ethics. In addition to the American Medical Association, there is the American Nurses Association, the American Hospital Association, the National Association of Social Workers, and many others.

2. Methods of Moral Decision-Making

Methods of moral decision-making are concerned, in a variety of ways, not only with moral decision-making but also with the people who make such decisions. Some such methods focus on the actions that result from the choices that are made in moral decision-making situations in order to determine which of such actions are right, or morally correct, and which of such actions are wrong, or morally incorrect. Other methods of moral decision-making concentrate on the persons who commit actions in moral decision-making situations (that is, the agents) in order to determine those whose character is good, or morally praiseworthy, and those whose character is bad, or morally condemnable. The theorists of such methods deal with such questions as: Of all of the available options in a particular moral decision-making situation, which is the morally correct one to choose?; What are the particular virtues of character that, in conjunction, constitute a good person?; Are there certain human actions that, without exception, are always morally incorrect?; What is the meaning of the language used in specific instances of moral discourse, whether practical or theoretical?; What is meant by a specific moral concept?; and many others.

What follows is a look at some of the most influential methods of moral decision-making that have been offered by proponents of such methods and that have been applied to ethical issues in the field of health care.

a. Virtue Ethics: Aristotle

While Aristotle was not the first of the Ancient Greeks to articulate in writing a theory of virtue ethics, his version, as it has come down to us, has been one of the most influential, if not the most influential of all. According to Aristotle, a person’s character is the determinative factor in discerning the extent to which that person is a good person. To the extent to which a person’s character is reflective of the moral virtues, to that same extent is that person a good person. Moral virtues include but would not be limited to courage, temperance, compassion, generosity, honesty, and justice. The person in whom these moral virtues are to be found as steadfast dispositions can be relied on to exhibit a good character and thereby to commit morally correct actions in moral decision-making situations. For example, a courageous soldier will neither run headlong into battle in the belief that “war is glory” nor run away from the battle out of fear of being injured or killed. The soldier who does the former has chosen to be rash during the heat of battle, while the soldier who does the latter has chosen to be a coward. By contrast, the courageous soldier holds his position on the battlefield and chooses to fight when he is ordered to do so. The fundamental difference between the courageous soldier on the one hand and the rash and cowardly soldiers on the other is that, of the three, only the courageous soldier actually knows why he is on the battlefield and chooses to do his duty to defend his comrades, his country, and his family while recognizing, at the same time, the realistic possibility that he might be injured, or even killed, on the battlefield (Aristotle, 1985).

Virtue ethics is directly applicable to health care ethics in that, traditionally, health care professionals have been expected to exhibit at least some of the moral virtues, not the least of which are compassion and honesty. To the extent that the possession of such virtues is a part of one’s character, such a health care professional can be relied on to commit morally correct actions in moral decision-making situations involving the practice of health care.

b. Utilitarian Theories: Mill

The preeminent proponent of utilitarianism as an ethical theory in the 19th century was John Stuart Mill. As a normative ethical theorist, Mill articulated and defended a theory of morality that was designed to prescribe moral behavior for all of humankind. According to Mill’s utilitarian theory of morality, human actions, which are committed in moral decision-making situations, are determined to be morally correct to the extent to which they, on balance, promote more happiness (as much as possible) than unhappiness (as little as possible) for everyone who is affected by such actions. Conversely, human actions, which are committed in moral decision-making situations, are determined to be morally incorrect to the extent to which they, on balance, produce more unhappiness rather than happiness for those who are affected by such actions. Mill hastens to acknowledge that the agent in the moral decision-making situation must count oneself as no more, or less, important than anyone else in the utilitarian calculation of happiness and/or unhappiness.

However, unlike virtually all of his utilitarian predecessors, Mill offered a version of utilitarian ethics that was designed to accommodate many, if not most, of the same ethical concerns that Aristotle had expressed in his version of virtue ethics. In other words, even after the utilitarian calculation of the ratio of happiness to unhappiness, in a particular moral decision-making situation, has identified an option as morally correct, an additional calculation might be in order to determine the ratio of happiness to unhappiness in the event that such an option, in future like cases, would consistently be deemed the appropriate one. If this latter calculation would likely result in a ratio of unhappiness over happiness, then the option in the original case might be rejected (despite its having been recommended by the utilitarian calculation for the original moral decision-making situation). For example, in a moral decision-making situation in which an employed blue-collar worker witnesses a homeless person dropping a twenty-dollar bill on the sidewalk, the utilitarian calculation would recommend, as the morally correct option, to return the twenty-dollar bill to the homeless person rather than to keep it for oneself. Suppose, however, the same exact moral decision-making situation except that, rather than a homeless person, it is a universally known and easily recognizable multi-billionaire who drops the twenty-dollar bill on the sidewalk. Even if the utilitarian calculation were to determine that the blue-collar worker should keep the twenty-dollar bill for oneself, the additional calculation would involve the question of the likely negative effect of such an action, if repeated in a habitual way, on the agent’s own character over a period of time.

Another possible reason to reject an otherwise recommended option, based on the utilitarian calculation, would be if the same option were to be chosen routinely by others in society, as influenced by the action in the original case in question. To the extent that the action in question, if repeated routinely by others in society, would result in unfavorable consequences for the society as a whole, that is, it would run counter to the maintenance of social utility, then the agent in the original moral decision-making situation in which this action was an option should choose to refrain from committing this action. For example, if a prominent citizen of a small town, upon learning that the local community bank was having financial problems due to an unusually bad economy, decided to withdraw all of the money that he had deposited in his accounts with this bank, the utilitarian calculation, applied to his case alone, would presumably sanction such an action. However, precisely because this man is a well-known citizen of this small town, it can reasonably be predicted that word of his bank withdrawal would spread throughout the town and would likely cause many, if not most, of his fellow citizens to follow suit. The problem is that if the vast majority of the townspeople did follow suit, then the bank would fail, and everyone in this town would be worse off than before. In other words, this would serve to undermine social utility, and so, once this further consideration is taken into account, the original action would not be recommended.

As applicable to health care ethics, utilitarian considerations have become fairly standard procedure for large percentages of health care professionals over the past several generations. It is not at all uncommon for decisions to be made, by health care professionals at all levels of health care, on the basis of what is in the best interest of a particular collectivity of patients. For example, officials at the U. S. Centers for Disease Control (CDC) learn of an outbreak of a serious, potentially fatal communicable disease. These officials decide to quarantine hundreds of people in the geographic area in which the outbreak occurred and to mandate that health care professionals across the country who diagnose patients with this same communicable disease must not only take similar measures but also must report the names and other personal information of the affected patients to the CDC. These decisions are, themselves, decisions of moral (if not also legal) decision-making, and these decisions raise additional moral issues. At any rate, the fundamental reason for taking such measures, under the specified circumstances, is for the protection of the health of the citizens in those areas where the outbreaks occurred, but, ultimately, such measures are taken for the protection of the health of American citizens in general, that is, to promote social utility (Mill, 1861).

c. Deontological Theories: Kant

A deontological normative ethical theory is one according to which human actions are evaluated in accordance with principles of obligation, or duty. The most influential of such theories is that of Immanuel Kant, whose categorical imperative, as his fundamental principle of morality, was first formulated as, “Act only on that maxim whereby you can, at the same time, will that it should become a universal law.” In application to any particular moral decision-making situation, the agent is being asked to entertain the question of whether the action that one has chosen to commit is sufficiently morally acceptable to be sanctioned by a maxim, or general principle. In other words, the agent is asked to attempt to universalize the maxim of one’s chosen action such that all rational beings would be morally allowed to commit the same action in relevantly similar circumstances. If this attempt to universalize the maxim were to result in a contradiction, such a contradiction would dictate that the maxim in question cannot be universalized; and if the maxim cannot be universalized, then one ought not to commit the action. Kant asks his reader to consider the case of a man who stands in need of a loan of money but who also knows well that he will not be able to repay such a loan in the appropriate amount of time. The maxim of his action would be: Whenever I find myself in need of a loan of money but know that I am unable to repay it, I shall deceitfully promise to repay the loan in order to obtain the money. To attempt to universalize this maxim, this man would need to entertain a future course of events in which all rational beings would also routinely attempt to act on this same maxim whenever they might find themselves in relevantly similar circumstances. However, as a rational being, this man would come to realize that this maxim could not be universalized because to attempt to do so would result in a contradiction. 
For, if such an action were to become a routine practice, on the part of all rational beings in relevantly similar circumstances as those in this man’s case, then those who loan money (either as loan officers for financial institutions or as private financiers themselves) would almost immediately wise up to the fact that people are routinely attempting to borrow money on deceitful promises, that is, with no intention to repay such loans. Thus, the loaning of money would, at least temporarily, come to a halt. As Kant points out to his reader, because of the contradiction involved in attempting to universalize this maxim, neither the promise (deceitful as it is) itself nor the end to be achieved by the promise (that is, the loan of money) would be realizable. So, the fact that a contradiction results from the attempt to universalize the maxim reveals the impossibility of the maxim being able to be universalized, and because the maxim cannot be universalized, then the man ought not to commit the action.

A second formulation of the same categorical imperative reads, “Act in such a way that you treat humanity, whether in your own person or in the person of any other, never as a means only, but always as an end.” With this formulation, Kant calls attention to his belief that all rational beings are capable of exhibiting a “good will,” which he claims is the only thing in the universe that has intrinsic value, that is, inherent value, and because a good will can only be found in rational beings, they have a singular type of dignity that must always be respected. In application to any specific moral decision-making situation, the agent is being asked to respect rational beings as valuable in, and for, themselves, or as ends in themselves, and, thereby, to commit to the principle never to treat a person (either oneself or any other) as merely a means to some other end. To apply this formulation of the categorical imperative to the same example as before is to realize that, once again, one ought not to make a deceitful promise. For, to make a deceitful promise to repay a loan of money in an effort to obtain such a loan is to treat the person to whom such a promise is made as a means only to the end of obtaining the money. To be faithful to this formulation of the categorical imperative is never to commit any action that treats any person as a means only to some other end (Kant, 1989).

Deontological theories, in general, and Kant’s categorical imperative (in either of these two formulations), in particular, can be applied to any number of issues in the practice of health care. For example, if a patient who had been prescribed an opioid for only a short period of time, post-surgery, were to contemplate whether to feign the continued experience of pain during the follow-up visit with the surgeon in an effort to obtain a new prescription for the same opioid in order to abet the opioid addiction of a friend, then the patient would be attempting to treat the surgeon as a means only to another end. Because any attempt to universalize the maxim of such an action would result in a contradiction, Kant’s categorical imperative would allow one to see that such an action ought not to be taken.

d. Principlism

One approach to health care ethics was actually developed as a result of its originators’ belief that, especially, utilitarian and deontological ethical theories were inadequate to deal effectively with the issues that had arisen in medical ethics in particular. Tom Beauchamp and James Childress introduced their “four principle approach” to health care ethics, sometimes referred to as “principlism,” in the final quarter of the 20th century. Central to their approach are the following four ethical principles: 1) respect for autonomy, 2) nonmaleficence, 3) beneficence, and 4) justice. These four ethical principles, in conjunction with what are identified as moral rules and moral virtues, together with moral rights and emotions, provide a framework for what they call the “common morality.” This common morality is put forward as the array of moral norms, which are acknowledged by all people who take seriously the importance of morality, regardless of cultural distinctions and throughout human history and so are said to be universal. However, given the abstract nature of these ethical principles, it is necessary to instantiate them with sufficient content so as to be able to be practically applicable to particular cases of moral decision-making. This is what is referred to as an application of the method of specification, which is designed to restrict the range and the scope of the ethical principle in question. In addition, each ethical principle, again, in order to be practically applicable, needs to be subjected to another methodological procedure, namely that of balancing according to which the principle, as a moral norm that is competing with others, and in order to be eligible for application to a particular case of moral decision-making, needs to be deemed to be of sufficient weight or strength, as compared to its competitors (Beauchamp and Childress, 2009).

None of the four ethical principles has been designated as enjoying superiority over the others; in fact, it is explicitly acknowledged that any of the four principles can, and would, reasonably be expected to conflict with any other. Because of this, it has been pointed out that this method of moral decision-making is subject to the problem of having no means by which to adjudicate such conflicts. Moreover, to the extent that, in practice, the application of principlism can be reduced to a mere checklist of ethical considerations, it is not sufficiently nuanced to be, ultimately, effective (Gert and Clouser, 1990).

e. Casuistry

Another method of moral decision-making that explicitly rejects the use of any ethical theory or any set of ethical principles is known as “casuistry.” Although not a new method of moral decision-making, it was re-introduced by Albert Jonsen and Stephen Toulmin in the last quarter of the 20th century within the context of ethical issues in the field of health care. This method of moral decision-making is not unlike what is normally referred to in the Western system of jurisprudence as “case law,” which makes almost exclusive use of what are considered to be “precedent-setting cases” from the past in an effort to decide the present case. In other words, like the method of decision-making that is used by judges who must render decisions in the law, casuists insist that the best way in which to make decisions on specific cases as they arise in the field of health care, and which raise significant moral issues, is to use prior cases that have come to be viewed as paradigmatic, if not precedent setting, in order to serve as benchmarks for analogical reasoning concerning the new case in question. For example, if a new case were to come about in the field of health care that raised the moral issue of how the health care professionals of a hospice organization should treat a woman who is five months into her pregnancy but who also has been diagnosed with stage four pancreatic cancer and has a life expectancy between two and three months, the casuist would advise that the moral decisions concerning the treatment of this woman should be made by seeking out as large a number as possible of cases that had occurred prior to this one and that exhibited as many as possible relevantly similar salient characteristics in addition to as many as possible of the same moral issues. 
To render moral assessments concerning how these previous cases were handled is to establish guideposts for moral decision-making in the present case under consideration. Some of these cases will have been handled in more morally acceptable ways than others, and especially instructive would be at least one case that stands out as reflecting either decisions determined to have been obviously morally correct or decisions determined to have been blatantly morally objectionable (Jonsen and Toulmin, 1988).

According to the proponents of casuistry, normative ethical theories and ethical principles can take moral decision-making only so far because, first, the abstract nature of such theories and principles is such that they fail to adequately accommodate the particular details of the cases to which they are applied, and second, there will always be some cases that serve to confound them, either by failure of the theory or principle to be practically applicable or by suggesting an action that is found to be morally unsatisfying in some way. However, casuistry, as a method of moral decision-making, seems to make use of various sorts of moral norms or rules, if only in a subconscious or nonconscious way. For, to reason analogically from a paradigmatic or precedent-setting past case to a current case, which exhibits even a good number of relevantly similar salient characteristics and even a good number of the same moral issues, is to base one’s judgment on some norm or rule that serves as the moral standard by which to draw out the points of agreement or disagreement between the past and the present cases. Furthermore, this moral norm or rule, itself, will almost certainly turn out to reflect either popular societal bias or cultural bias, because the method deliberately refrains from the use of normative ethical theories and ethical principles, both of which carry with them standards of objectivity (Beauchamp and Childress, 2009).

f. Feminist Ethics

Not unlike the proponents of casuistry, the proponents of what has come to be called “feminist ethics” shun the use of ethical theories; however, because their approach is distinctively different from traditional methodological approaches to ethics in general, and to health care ethics in particular, they are also skeptical of traditional ethical concepts, including the concept of autonomy. In an effort to focus more particularly on issues concerning gender equality, including the social and political oppression of women as well as the suppression of women’s voices on social, political, and ethical issues, the concept of autonomy, in an abstract sense, is thought to be less meaningful for women who are socially and politically oppressed, by virtue of their gender, than for men. For example, even though, theoretically and even legally, women, by a particular point in time during the first half of the 20th century, were eligible for admission to medical schools should they have chosen to exercise their autonomous rights to apply for such admission, in practice and in fact, both the social conditioning of women and the gender bias of the men who administered medical schools, and who made decisions on which applicants would satisfy the requirements for admission, ensured that medical schools would graduate, almost exclusively, men (with only single-digit exceptions in America). The point is that the concept of autonomy, in its theoretical sense, is too abstract to have had any practical application to women, in this case, whose eligibility for acceptance to medical schools was denied on the basis of gender. Rather, the social realities of the day-to-day existence of women, within their social, political, and cultural confinements, must be addressed in such a way that the specific circumstances concerning a particular woman’s relationships with other people, in all of their varieties of dependence, if not interdependence, are taken into account. 
Thus, in addition to this concept of relational autonomy, concepts of responsibility and compassion as well as those of freedom and equality are essential to the majority of the proponents of feminist ethics (Holmes and Purdy, 1992 and Sherwin, 1994).

While those who consider themselves to be proponents of feminist ethics hold a range of perspectives, not only on some of the most important ethical issues within the framework of this school of thought but also on the very nature of this school of thought, they agree on the need to reflect on both the oppression and the suppression of women that have been inherent in almost every culture throughout human history.

g. The Ethics of Care

Yet another method of moral decision-making, which is sometimes thought of as a sub-field of feminist ethics but which, having been given birth by feminist ethics, came in the early 21st century to be seen as a methodology in its own right, is usually referred to as the ethics of care. Like the proponents of feminist ethics, the proponents of the ethics of care have decided that any methodology of moral decision-making that is based on abstract theories or principles, rights or duties, or even objective decision-making turns out to be unsatisfying in terms of interacting with others in moral decision-making situations. Instead, the focus should be, again as for the proponents of feminist ethics, on the specific circumstances of the personal relationships of individual people, with particular attention to be paid to compassion, sympathy/empathy, and a sincere concern for caring for others with whom one shares any intimate relationship. The upshot of this methodology is that “caring” is a necessary constituent of all moral decision-making but that it has been absent from the traditional methodologies of moral decision-making. For, traditional normative ethical theories render objectivity an essential ingredient in moral decision-making but in so doing leave no place for the care (that is, the compassion, sympathy/empathy, and kindness) that is necessary for our inter-personal relationships to be morally successful (Held, 2006).

Nursing, as a profession, has been, traditionally, a profession of the nurturing of, as well as the caring for, the patient. Until the latter part of the 20th century, nursing was also, historically, a profession for women. It should come as no surprise, then, that an “ethics of care” approach to moral decision-making would be embraced by nurses, as well as by women in other health care professions, up to, and including, the profession of medical doctors. (In no way is this to suggest that this ethics of care would, either intentionally or in practice, preclude men from identifying with it also.) In the early 21st century, this approach to patient care in medical facilities, as well as in allied health care facilities, became almost mainstream in many societies across the globe, with accrediting agencies offering their respective “seals of approval” for those medical organizations that are successful in treating patients holistically. It should go without saying that many are the health care professionals who would choose to nurture, and to care for, their own patients in this same way, with or without the existence of any such accrediting agencies (Kuhse, 1997).

3. Ethical Principles

In addition to the application of a variety of methods of moral decision-making to the practice of health care, ethical principles are also so applicable, but not procedurally in the same way as in the method of moral decision-making identified above as principlism. As concerning normative ethical theories, in particular, regardless of the particular method of moral decision-making and its moral standard for action that one might choose to apply to moral decision-making situations, and even in the absence of any such theory or standard being applied in the day-to-day practice of professionals in the field of health care, ethical principles serve to guide one’s actions in moral decision-making situations by identifying those important and relevant considerations that must be taken into account in order for one to be able to think about such situations in a serious way. In other words, ethical principles operate on a different level of moral decision-making than do normative ethical theories or other methods of moral decision-making; nonetheless, ethical principles, like normative ethical theories and these other methods of moral decision-making, are prescriptive, that is, they offer recommendations for moral action. In theory, ethical principles can be used as one measure of how effective normative ethical theories are in their application to moral decision-making situations. For, any proposed normative ethical theory that is incapable of accommodating the requirements of the most fundamental ethical principles can be called into question on that very basis.

a. Autonomy

Patient autonomy, in the clinical context, is the moral right on the part of the patient to self-determination concerning that patient’s own health care. Conversely, whenever a health care professional restricts, or otherwise impedes, a patient’s freedom to determine what is done, by way of therapeutic measures, to that patient, and attempts to justify such an intrusion by reasons exclusively related to the well-being, or needs, of that patient, that health care professional can be construed to have acted paternalistically. In practice, autonomy, on the part of the patient, and paternalism, on the part of the health care professional, represent mutually exclusive events; that is, to the extent that one of these two is present in decision-making, and the attendant actions, within the clinical relationship of the patient and the health care professional, to that same extent is the other one absent. In other words, for the health care professional to act paternalistically is for that same health care professional to have failed to respect the patient’s autonomy, and conversely, for the health care professional to respect the patient’s autonomy is for that same health care professional to have refrained from acting paternalistically.

For example, if a physician were to offer a patient only one recommendation as a remedy for a particular medical malady, when, in fact, the physician knows of more than one prospective remedy (even if the different prospective remedies would be expected to address the medical issue in question to varying degrees and/or might have distinct reputations for varying degrees of success), then the physician in question would be said to have acted paternalistically and, thereby, to have failed to respect that patient’s moral right to autonomous decision-making. In such a case, the physician might hasten to call attention to the fact that, in the typical clinical situation, the physician’s knowledge of the medical issue in question is both qualitatively and quantitatively superior to that of the patient. This fact, while not in dispute, fails to change the nature of this physician’s act of paternalism.

Some health care professionals continue to profess their own personal beliefs that patient autonomy is over-rated because, in their own clinical experience, patients continue to make poor decisions concerning what is in the best interest of their own health care. Certainly, this is a realistic concern, and it probably always will be. However, in some cases, the poor decisions, on the part of the patient who has exercised the right to self-determination concerning one’s own health care, can be explained, at least in part, by the fact that the health care professional in question has failed to engage in the necessary amount of “patient education” in an effort to ensure that better-quality decisions can be made by the patient. In too many cases, this failure, on the part of the health care professional, is due to a mismatch between the language in which the patient education takes place and the patient’s ability to comprehend language at a certain level of sophistication. That is, not every adult patient has the ability to comprehend medical explanations even if such explanations are cast in the language of the native tongue of the patient and even if the ability of comprehension that is necessary for a proper understanding is at the level of, say, an average high school graduate. The point is that a genuine respect for the patient’s right to autonomous decision-making concerning one’s own health care demands that each and every health care professional make a sincere effort to ascertain the level of language comprehension of each and every patient, and to convey, in language that is understandable by the patient, all of the relevant medical information that is necessary in order for the patient to be able to make, in consultation with the health care professional, better-quality health care decisions than might otherwise be the case.

Starting in the latter part of the 20th century, and continuing in a sustained progression into the 21st century, many, if not most, health care professionals (including physicians) have come to believe that the patient’s moral right to self-determination concerning their own health care is of fundamental importance to the success of the delivery of health care. This transition, from the practice of health care being extremely paternalistic, with virtually no recognition of the patient’s right to autonomy, to the practice of health care in the 21st century, and especially in Western cultures, being such that patient autonomy is respected by health care professionals in general, as being of prime importance in the clinical context, has been painstakingly incremental. However, a fundamental problem concerning this respect for the patient’s autonomy persists, and this is the problem of the inconsistency with which it is applied. Many health care professionals are the first to sing the praises of the need to respect the patient’s autonomous preferences in their own health care; however, they are all too willing to make exceptions in situations in which they, themselves, are fundamentally opposed to such an autonomous decision to be made by a particular patient. Reasons for these so-called exceptional cases vary from cultural or religious differences between the health care professional, on the one hand, and the patient, on the other, to the patient in question being a close relative, or friend, of the health care professional (even in a clinical situation in which the health care professional has no part in the practice of health care for this close relative or friend). In both of these types of cases (and in many like them), the so-called exceptional cases are not exceptional cases at all. Rather, subjective considerations have taken the place of the more objective considerations on which the health care professional in question normally acts; that is, in every such case, the health care professional is imposing one’s own personal beliefs on the patient (albeit, usually, for the patient’s own good, that is, as an act of paternalism), and thereby, is failing to actually respect the patient’s autonomy. In the final analysis, the health care professional’s respect for autonomous decision-making on the part of the patient, in order for it to be sincere and objective, cannot be something that is adhered to only when it is convenient for the health care professional and suspended when it is inconvenient, again, for the health care professional. On the contrary, for a health care professional to respect a patient’s autonomy is to respect that patient’s autonomous goals and preferences, even if the health care professional does not agree with them. At its most fundamental level, a true respect for autonomous decision-making on the part of the patient demands that it be honored, objectively, even in the tough cases.

b. Beneficence

To act beneficently toward others is to behave in such a way as to “do good” on behalf of, or to benefit, someone other than oneself. To the extent to which health care professionals serve their patients by helping them to maintain or improve their health status, health care professionals can be said, to the same extent, to be acting beneficently toward the patients they serve. In theory, every action performed by a health care professional, in a professional relationship with a patient, can be expected to be guided by the ethical principle of beneficence. Moreover, the respect for patient autonomy and the practice of beneficent medical care can be considered to be mutually complementary. For, it is difficult to imagine a health care professional who is committed to the principle of beneficence, on behalf of one’s patients, without also respecting the right to autonomous decision-making on the part of those same patients.

However, despite the complementary nature of the ethical principle of autonomy and that of beneficence, it is not uncommon for these two ethical principles to conflict with one another. It is possible for a patient’s autonomous preference to appear to conflict with what is in that same patient’s own best interest(s). For example, a young adult patient who has only recently suffered a ruptured appendix (such that it is still early in the progression of pain) might refuse to undergo an appendectomy for the reasons that the patient has never undergone surgery before and claims to be deathly afraid of hospitals. To respect this patient’s autonomy would be to allow the patient, inevitably, to die, which, reasonably, is not in the patient’s own best interest. On the other hand, to coerce this patient into agreeing to the appendectomy, and thereby to prevent the patient’s death, would be to fail to respect the patient’s autonomous preference. It is also possible for a patient’s autonomous preference to appear to conflict with the best interest(s) of someone else. For example, a patient who has only recently been diagnosed with a serious sexually transmitted infection (STI) might agree to treatment for this STI only on the condition that the health care professional in question promise to refrain from telling the patient’s spouse about the STI (as the patient’s attempt to invoke the privilege of confidentiality that is considered to be inherent in the health care professional-patient relationship). To respect this patient’s autonomy is to place at risk the health status of the patient’s spouse, at the very least, regardless of whether the patient is provided treatment for this STI.

Such cases of conflict between these two ethical principles would normally be adjudicated according to which right (that is, that of autonomy or that of beneficence) can reasonably, and objectively, be determined to supersede the other in importance. In the former example, the patient, after recovering from the life-saving appendectomy, might be appreciative of the fact that the principle of beneficence was allowed to prevail over the principle of autonomy. In the latter example, the right to know, on the part of the patient’s spouse, of one’s own potential health risks involved in the patient’s having contracted the serious STI in question would allow for the principle of beneficence (concerning another rather than the patient) to take precedence over the principle of autonomy. Of course, many are the occasions on which the principle of respect for autonomy might take precedence over the principle of beneficence. Take, for example, a patient who is similar to the one in the above-mentioned case of a ruptured appendix in that the patient is, once again, deathly afraid of hospitals, but this time is elderly and has had only one surgery, although a major one. This time the surgery is recommended to remedy a leaky mitral valve in the patient’s heart. If the patient has had a series of bouts of patient education such that the cardiologist can, reasonably, determine that the patient is sufficiently aware of the ramifications of both options (that is, the likelihood that the mitral valve repair would be successful and the equal likelihood that refraining from undergoing this mitral valve repair would, within a relatively short period of time, result in the patient’s death), then respect for this patient’s autonomous decision to refrain from undergoing this surgical procedure might reasonably be seen as superseding this patient’s right to beneficence, that is, to actually undergo this surgical procedure.

c. Nonmaleficence

An ethical principle that is typically traced back to the Oath of Hippocrates is to “first, do no harm,” or to refrain from engaging in any acts of maleficence in the clinical context, that is, acts that would result in harm to the patient. Acts of maleficence can be intentional or unintentional, and a large percentage of the latter kind happen as a result of either negligence or ignorance on the part of the health care professional. An example of the former would be a surgeon who fails to exercise due diligence in scrubbing prior to surgery, the result of such negligence being that the surgical patient contracts an infection. An example of the latter would be a primary care physician who fails to scrutinize sufficiently the recent medication history of a patient prior to prescribing a new medication, the result of such ignorance being that the patient suffers a new health issue due to the adverse interaction of the newly prescribed medication with a previously prescribed one that is still being taken by the patient.

Because of the intimate relationship between the principle of nonmaleficence and that of beneficence, it is possible (at least in some cases) to construe the violation of either as a violation of the other. In other words, it might be possible to construe the failure to act in such a way as to benefit someone not only as a violation of the principle of beneficence but also as a violation of the principle of nonmaleficence. Conversely, it might be possible to construe the committing of an action that, reasonably, would be expected to actually cause harm to someone, not only as a violation of the principle of nonmaleficence but also as a violation of the principle of beneficence. To leave a surgical patient under general anesthesia longer than is medically necessary would be an example of the former, and to allow surgery to be performed on a patient by a surgeon who is under the influence of drugs or alcohol to the extent that the surgeon’s skills and judgment have been seriously impaired would be an example of the latter.

Raising the question of whether the principle of nonmaleficence has been violated would also include clinical situations in which it can be determined, objectively, that the potential risks of the recommended treatment option, be it a procedure or a medication, actually outweigh the expected benefits, all things considered. To avoid this possibility, a calculation of the ratio of potential risks to expected benefits (sometimes referred to as a risk-benefit analysis) in the case of both medical procedures and the prescribing of medications is always necessary. For a health care professional to fail to render such a calculation is, at least in theory, to violate the principle of nonmaleficence.

d. Justice

In the clinical context, the ethical principle of justice dictates the extent to which the delivery of health care is provided in an equitable fashion. As such, justice is not applicable to particular decisions, or their attendant actions; rather, the principle of justice is intended to provide the guidance that is necessary to ensure that, considered in conjunction with one another, one’s decisions, and their attendant actions, are consistent each with the others. Consequently, the hallmarks of the concept of justice are fairness and impartiality. In the context of health care, the question of justice is concerned with the degree to which patients are treated in a fair and impartial manner. Justice, as an ethical principle, demands that the actions taken by health care professionals, in their professional relationships with patients, be motivated by a consistent set of standards concerning the relevance of the variety of factors that are taken into consideration for such actions. For example, the recommendation, on the part of a health care professional, of two different primary treatment options for two different patients, each of whom has presented with the exact same symptoms to approximately the same extent, and with no known other relevant differences between the two patients except for one demographic distinction (say, age, gender, or race), would, when taken together, appear to be unjust.

Of course, it is possible for a health care professional to be the subject of an unsubstantiated and erroneous charge of injustice concerning two, or more, clinical cases that might appear to be relevantly the same. Typically, the reason for such an accusation, should the accusation be inaccurate, is that the accuser is lacking the requisite knowledge of the cases in question in order to be able to determine that, although these two, or more, cases do, indeed, appear to be relevantly similar, in fact, they are not. For example, a physician assistant might prescribe two different antibiotics (one of which has been proven to be highly effective but the other of which has an inconsistent success rate, each for the same medical malady) to two different patients who have been diagnosed with the medical malady in question. Learning of these facts, someone might accuse the physician assistant of being unfair, that is, unjust, in the treatment of these two patients. However, what this accuser does not know is that the patient for whom the less effective antibiotic was prescribed is deathly allergic (that is, subject to anaphylactic shock) to the antibiotic with the higher success rate.

In the final analysis, the ethical principle of justice demands that cases that are relevantly similar be treated the same and that cases that are relevantly different be treated in appropriately distinct ways in recognition of such differences.

4. Ethical Issues

The practice of every profession reveals ethical issues that are endemic to the professional field in question. The practice of health care is no different. What follows is a look at some of the most pervasive ethical issues that are encountered in the practice of health care.

a. The Health Care Professional-Patient Relationship

Any ethical issues that can arise within the clinical relationship between the health care professional and the patient are of the utmost importance if only because this relationship represents the front line of the provision of health care. The most important part of this relationship is trust on the part of each of the participants in this relationship. This is why the issues of truth-telling, informed consent, and confidentiality are essential to the success of any relationship between a patient and a health care professional.

i. Truth-Telling

The most important value of telling the truth is that, under ordinary circumstances, the recipient of a claim, offered by someone else, has reasonable expectations that the claim is true, and for that reason, will, more often than not, adopt such a claim (it is to be hoped only after subjecting it to sufficient scrutiny), incorporate it into one’s own belief system, and eventually act on it. To act on this formerly received claim, which, subsequently, has become one’s own belief, is to engage in autonomous decision-making. However, should it turn out that such a belief is objectively inaccurate because the claim (from which this belief was derived) was not true, then the person who is acting on this belief will have had one’s own capacity for autonomous decision-making compromised. True, or genuine, autonomous decision-making is possible only if the beliefs on which such decisions are made are accurate; in other words, any decision that is based on an inaccurate belief (even if the belief is not recognized as such), cannot be a true autonomous decision. Thus, every person can be said to be under a moral obligation to tell the truth, especially on topics the claims about which are important and relevant to the lives of their recipients. For, in such cases, the recipients of such claims, who choose to accept them, will, eventually, hold them as beliefs, and will act on them in order to pursue what they take to be interests of their own, and, perhaps, too, the interests of others.

To respect another person, as a person, is to respect that other person’s right to autonomous decision-making, especially when such decisions concern their own interests that bear, in important and relevant ways, on the quality of their own lives. For, the quality of one’s life is a pre-requisite for human happiness, and of the entire range of interests that one might identify as essential to one’s own happiness, good health is arguably the most fundamental. Not only can the moral right to autonomy be said to be the most important right of a patient, in a clinical setting, it also can be said to be the foundational right for all of the other rights that a patient can be said to have. In order for a patient to be able to protect one’s own interest in promoting, or regaining, one’s own health, that patient’s moral right to autonomy demands to be respected.

To the extent to which any health care professional, in a professional relationship with a patient, fails to be honest with a patient (concerning that patient’s diagnosis, the recommended treatment options, the identification of realistic potential risks and expected benefits associated with such treatment options, or the patient’s prognosis by virtue of the diagnosis in relationship to each of the recommended treatment options), that patient’s autonomy can be said to have been compromised. If this compromised autonomy were to result in the patient’s inability to protect one’s own interest in promoting, or regaining, one’s own health, then this failure to be honest with the patient would represent a moral failure on the part of the health care professional. For example, a physician who, when asked explicitly by a patient what the potential adverse side-effects of the medication that the physician is in the process of prescribing might be, responds in such a way as either to play down the number and severity of such adverse side-effects or to suggest that there are none, can reasonably be considered to have failed one’s patient by having been dishonest. Any attempt, on the part of the physician, to justify such deception as an act of beneficence toward the patient is doomed to failure because, by definition, such deception, resulting from such a motive, would constitute an act of paternalism, that is, an act that would disregard the patient’s right to autonomous decision-making.

ii. Informed Consent

Concerns about patient autonomy give rise to the concept of “informed consent.” For, if one believes that the patient, indeed, does have a moral right to self-determination concerning one’s own health care, then it would seem to follow that health care professionals, especially physicians, ought not to prescribe any therapeutic measure in the absence of the patient’s informed consent.

Informed consent is intended to be not only a moral but also a legal safeguard for the respect of the patient’s autonomy. Furthermore, informed consent is designed to promote the welfare of the patient (that is, to ensure the patient’s right to beneficence) and to avoid the causing of any harm to the patient (that is, to ensure the patient’s right to nonmaleficence). In the clinical context, informed consent is a reference to a patient’s agreement to, and approval of, any recommended treatment or procedure that is intended to be of therapeutic value to the patient but only on the condition that the patient has an adequate understanding of all of the most important and relevant information concerning the treatment or procedure in question.

Typically, the concept of “informed consent” arises in the context of a patient (or either a patient advocate or a patient surrogate) who asserts a right to informed consent; it is usually articulated as the patient’s “right to know” any, and all, relevant information in the therapeutic relationship (usually) with the physician. A patient enters a therapeutic relationship with a physician either in an effort to maintain one’s current status of optimal health (perhaps, with an annual visit for a physical examination in conjunction with a series of laboratory, or other, diagnostic tests) or in an effort to regain the lost status of optimal health that the patient might have previously enjoyed. To fail to respect the patient’s right to informed consent, by refraining from providing any specific important and relevant information to the patient, is to fail to uphold either the principle of beneficence or the principle of nonmaleficence, if not both.

For example, a physician might choose to knowingly, and intentionally, refrain from informing a patient of the potential risks of a certain procedure that has been recommended, up to and including a realistic risk of death. Other examples would include specific anesthetics that have a risk, small though it might be, of causing the death of the patient. To genuinely respect the patient’s right to informed consent in cases like these would be for the physician to fully inform the patient of such risks and to inform the patient, too, of the most recent statistics on how probable such risks might be. This would provide the patient with the opportunity to make a more informed decision in consultation with the physician.

Consequently, for informed consent to be truly meaningful, from the patient’s perspective, not only does the physician have an obligation to provide any and all important and relevant information concerning recommended treatments and procedures but also an obligation to refrain from interfering, without justification, with the patient’s ultimate decision.

Julian Savulescu and Richard W. Momeyer argue, effectively, that not only does being insufficiently informed of relevant information restrict a patient’s autonomous decision-making, but so too does the holding of irrational beliefs, which could result in irrational deliberation. To illustrate this point, they choose the case of a patient who is a Jehovah’s Witness and who, on grounds of religious beliefs, refuses a prospective life-saving blood transfusion. They argue that, rather than viewing such a case as one in which the health care professional ought to exercise deference to the patient’s right to autonomous decision-making, out of respect for a patient whose value system differs from one’s own, the health care professional has a moral obligation not only to attempt, as best one can, to inform the patient of all of the important details that are relevant to the patient’s current health care situation, but also to spend the time that is necessary to help guide the patient through a process of rational deliberation concerning those details in an effort to make the best possible treatment decision. To attempt to accomplish both of these tasks is to demonstrate respect for the patient’s right to autonomous decision-making in a way that merely addressing the former task does not. Savulescu and Momeyer recognize, and advise against, the exercise of paternalism, if not coercion, when it comes to both the providing of important and relevant information and the guiding of the patient through a process of, theoretically, rational deliberation because, as they say, to compel the patient either to accept medically justified information or to engage in practical rational deliberation concerning such information would be counter-productive in many respects (Savulescu and Momeyer, 1997; Savulescu, 1995).

In the case of any non-emergency medical procedure of any significance, there is a moral obligation to obtain the informed consent of the patient by written signature authorization of an informed consent document. In the case of any emergency medical procedure of any significance, there is a moral obligation to make every reasonable effort to obtain the informed consent of the patient, in like manner. Failing that (for example, due to the mental incapacity, or incompetence, of the patient), every reasonable effort should be made to obtain the informed consent, in like manner, of either a patient surrogate (if the patient has a durable power of attorney for health care decisions) or a patient advocate (in the absence of such an advance directive). Only in cases of an emergency medical procedure of any significance in which the nature of the illness, or injury, of the patient is such that proper treatment requires urgent medical attention, in addition to which it is not possible (again, due to the mental incapacity, or incompetence, of the patient) to obtain the written signature authorization of the patient, and there is insufficient time to secure the written signature authorization of either a patient surrogate or a patient advocate, would it be morally justified to proceed with such a medical procedure in the absence of any written signature authorization.

Adolescent patients represent a special case in that while, in many cases, the cognitive ability of the adolescent patient is sufficient to comprehend most, if not all, of the important and relevant information concerning their own health care needs as well as the recommended options for treatment, normally, they are not recognized as competent medical decision-makers in the law. To accommodate both of these facts, and in addition to the written signature authorization by a parent or guardian, every reasonable effort should be made to inform adolescent patients of all of the important and relevant information concerning their own health care needs and the recommended treatment options, including the approved one, in order to obtain their assent to the latter. An exception to this is the case of emancipated minors, that is, minors who are in the military, married, pregnant, already a parent, self-supporting, or who have been declared to be emancipated by a court; emancipated minors, in most legal jurisdictions, are granted the same legal standing as adults for health care decision-making.

iii. Confidentiality

There is a moral obligation to protect from dissemination any and all personal information, of any type, that has been obtained on the patient by any and all health care professionals at any medical facility. The justification for the protection of this right is integral to the very provision of health care itself. It is essential that there exist a relationship of trust between the patient and any health care professional. This is so because there is a direct correlation between the trust that a patient places in a health care professional to keep in confidence any and all information of a personal nature that surfaces within the context of their clinical relationship and the extent to which that patient can be expected to be forthcoming with full and accurate information about oneself, which is necessary in order for the proper diagnosis and treatment of the patient to even be possible. In fact, the absence of such trust, either well-founded or not, in the mind of a person who is considering whether to enter a patient-health care professional relationship can be sufficient to keep that person from entering such a relationship at all.

Adding to the concern that a patient in any medical facility has, with respect to the extent to which personal information about oneself can reasonably be expected to be kept in confidence, is the number of employees of such a facility (especially a hospital) who have access to such information. Even limiting the number of such employees to those who need access to such information in order to properly perform their own medical duties, and even allowing for relevant distinctions between, for example, small community hospitals in rural areas and large metropolitan medical centers that serve as “teaching hospitals” for medical schools, there are literally dozens of people who have such legitimate access. For example, it is not atypical for the personal information on a surgical patient in a hospital to be accessed by attending physicians as well as physicians who are specialists and who serve as case consultants, nurses (for example, in the operating room, in the post-anesthesia care unit, in a step-down unit, on a medical-surgical floor, and perhaps, in other clinical areas), therapists (respiratory, physical, and other types), laboratory technicians (of a variety of kinds), dieticians, pharmacists, and others, including, but not limited to, patient chart reviewers (for example, for quality assurance), and health insurance auditors. Eventually, a point is reached at which the very concept of “confidentiality” either no longer applies or loses any meaning that it might have originally had. Moreover, the greater the number of people who have access to the personal information on a patient, the greater is the possibility that such information might be compromised in any of a number of ways.

In order for the respect of the patient’s moral right to the confidential maintenance of personal information in the clinical setting to have any real credibility and in order to ensure that the patient receive the best possible quality of health care, there is a moral responsibility on the part of any and all health care professionals to exercise the utmost care in the handling of the personal information on the patient such that the access to, and the use of, such information is strictly limited to what is necessary for the proper medical care of the patient. Furthermore, patients, themselves, have the right to request access to their own medical records in any medical facility (including medical offices as well as hospitals and long-term care facilities) and should be allowed (to the extent to which it is reasonably possible) a voice in who else has access to such information. To allow the patient this kind of input in one’s own medical care can foster, in any of a number of ways, the relationship of trust between the patient and the various health care professionals that is necessary for the proper medical care of the patient. (Confidentiality rights for patients in America received a comprehensive make-over with the implementation, in 2003, of the Privacy Rule issued under the Health Insurance Portability and Accountability Act (HIPAA) of 1996.)

Despite the fact that the patient’s moral right to confidentiality, concerning personal information, is of the utmost importance and despite the fact that the physician-patient relationship has traditionally enjoyed a privileged status, even in the law, there is at least one exception to this moral right: the oral or written expression of the intention, in a serious and credible way, on the part of the patient, to harm another. Such a communication imposes on the health care professional not only a moral, but also a legal, obligation to notify the proper authorities. In such a case, the right of another to not be harmed supersedes the otherwise obligatory moral right to confidentiality on the part of the patient. (The Supreme Court of the State of California decision in the Tarasoff v. Regents of the University of California case (1976) held that mental health professionals have a legal obligation to warn anyone who is threatened, in a serious way, by a patient.)

Another possible exception to the patient’s moral right to confidentiality is to be found within the context of the policies and programs of public health organizations. Given that the primary goals of such organizations are to foster and to protect the health of the members of entire populations, or societies, of people, the fundamental means by which to accomplish these goals are policies and programs the intent of which is either to prevent illness and injury or to provide health care services. In their efforts to prevent illness, public health policies sometimes come into conflict with a patient’s moral right to confidentiality. For example, a person’s right to know, for reasons of self-protection, that one’s spouse has contracted a sexually transmitted infection, by virtue of this spouse’s extra-marital relationship with one, or more, other sexual partners, might be given precedence (on moral, if not legal, grounds) over this spouse’s moral right to confidentiality, which normally would be protected within the physician-patient relationship (in this case, the same physician-patient relationship in which this sexually transmitted infection was discovered). Depending on the severity of the particular type of sexually transmitted infection, and the degree to which it is wide-spread in the population in question, the fact that this spouse has contracted this particular sexually transmitted infection might reasonably be not only a matter of individual concern but also, properly, a public health matter.

b. The Question of a Right to Life

Of all of the ethical issues that can be encountered in the practice of health care, none have been more controversial than abortion, euthanasia, and physician-assisted suicide. Despite the passion with which debates are waged concerning the specific moral aspects of each of these ethical issues, a reasoned analysis of each might be expected to provide new opportunities for a better appreciation of its complexities.

i. Human Life: Abortion

At least since the time of the Oath of Hippocrates, with its explicit prohibition against abortion, there have been admonishments against the practice of the aborting of a human fetus together with arguments on both sides of this issue. Abortion is a perennial moral issue in most societies that ebbs and flows in its importance as an issue that serves to inform, if not incite, social debate and social action. However, over the late 20th century and early 21st century in America, stark differences between the opinions on each fundamental side of this issue have been voiced by people in the society at large, as compared to the reasoned debates waged by philosophers as a result of their attempts to bring clarity to the relevant moral issues, to the concepts that are inherent in such issues, and to the language that is used to express such issues and concepts. Historically, some theologians and some legal theorists have made moral and legal distinctions, respectively, that are relevant to the practice of abortion based on the concept of “quickening,” that is, the point in time (usually 16 to 20 weeks after conception) during a pregnancy at which the expectant mother is first able to discern fetal movement in the womb, and on the concept of “viability,” that is, the stage of development of the fetus (usually taken to be 24 weeks into the pregnancy) after which the fetus is expected to be able to survive outside of the womb (despite the likelihood of under-developed body organs and physiological, if not also mental, disabilities).

The U.S. Supreme Court decision in the case of Roe v. Wade (1973) recognized a woman’s legal right to an abortion under the “due process” clause of the Fourteenth Amendment of the U.S. Constitution, rendering illegal any outside attempts to the contrary (usually by state governments), during the initial trimester of the pregnancy, but allowing state governments to limit, although not prohibit, a woman’s decision to have an abortion during the second trimester of the pregnancy. From the end of the second trimester to the time of delivery, that is, after viability, state governments were granted the authority not only to limit but also to prohibit abortions, except where an abortion is necessary to preserve the life or health of the woman.

Despite the fact that those who adopt what are usually referred to as conservative positions and those who adopt what are usually referred to as liberal positions on the issue of abortion sometimes take the same position on related moral issues, for example, that murder is morally unacceptable and that people have a moral right to their own lives, many disagree, fundamentally, on the question of whether the act of abortion is also an act of murder and on the question of whether a fetus has a right to life. Since the Roe v. Wade landmark decision, most of the theoretical ethical debates have attempted to address each of these issues by focusing on the concept of “personhood,” as central to this debate.

Mary Anne Warren, in an influential essay in which she responds to many of the significant arguments in the literature to that point in time, makes an important distinction between what it is to be a human being and what it is to be a person. According to Warren, the classic argument against abortion succeeds only by committing the fallacy of equivocation. The argument is as follows: since it is morally incorrect to kill innocent human beings, and since fetuses are innocent human beings, then it follows that it is morally incorrect to kill fetuses. Warren points out that the proponent of this argument is equivocating on the term “human being.” For, in its occurrence in the initial premise, “human being” is intended to mean something like “a full-fledged member of the moral community,” that is, the moral sense of the term “human being,” but, in its occurrence in the second premise, “human being” is intended to mean something like “a member of the species, Homo sapiens,” that is, the genetic sense of the term “human being.” Because the term “human being” shifts its meaning from its occurrence in the initial premise to its occurrence in the second premise, the conclusion, in fact, fails to follow from its premises; in other words, because the proponent of this argument is guilty of the fallacy of equivocation, this argument (which in order to succeed would need a different term in the place of “human being,” the meaning of which would be preserved in both of its occurrences) fails.

Warren argues that “moral humanity” and “genetic humanity” are not synonymous in meaning because the membership of these two classes is not the same. In other words, persons are viable candidates to be “full-fledged members of the moral community” in a way in which human beings are not. Consequently, the moral community consists of all, but only, persons. She then entertains the question concerning what characteristics an entity must have in order to be considered a person and launches a search for what might constitute the criteria necessary for personhood. In the final analysis, she identifies five such criteria, which she offers as “most central to the concept of personhood,” as follows: “1) consciousness (of objects and events external and/or internal to the being), and in particular the capacity to feel pain; 2) reasoning (the developed capacity to solve new and relatively complex problems); 3) self-motivated activity (activity which is relatively independent of either genetic or direct external control); 4) the capacity to communicate, by whatever means, messages of an indefinite variety of types, that is, not just with an indefinite number of possible contents, but on indefinitely many possible topics; and 5) the presence of self-concepts, and self-awareness, either individual, or racial, or both” (Warren, 1973). Warren acknowledges that it should not be required of an entity that it must exhibit all five criteria in order to qualify as a person, nor should any particular one of these criteria be deemed necessary for personhood. However, she does identify the first two criteria, followed closely by the third, as the most important. Finally, she insists that any entity that exhibits none of these five criteria is, definitely, not a person, and that a human fetus is just such an entity.

Yet another argument against the right of a woman to have an abortion stems from the claim that, even if it can be demonstrated that a fetus is not, strictly speaking, a person, a human fetus is, after all, “potentially” a person. That is, if a fetus is allowed to develop, over the course of a normal pregnancy, it becomes more and more likely to become a person the closer that it gets to its time of delivery. The question is whether this potentiality for personhood should be considered to guarantee the fetus some rights akin to the rights of a person, for example, a right to life. Warren takes up this issue and concludes that, while the fact that the human fetus is a potential person might entail, on moral grounds, that women ought not to wantonly have abortions, in the final analysis, whenever the question comes down to the right to life of the fetus as opposed to the right of a woman to have an abortion, the right of the woman must always supersede the claimed right on behalf of the fetus because the rights of actual persons always outweigh the rights of potential persons.

Don Marquis takes on the question of the morality of abortion in a way that is separate and apart from any considerations of whether a fetus can be determined to be a person and even whether a fetus can be considered to be potentially a person. Rather, Marquis’s argument is an attempt to avoid the logical pitfalls of each of these other types of arguments. According to Marquis, the one factor that allows us to consider the taking of a human life to be morally objectionable is that to do so is to take away that individual’s life experiences, activities, projects, and enjoyments, which, had that individual’s life not been taken away, would have constituted that individual’s future personal life, all of which (that is, one’s experiences, activities, projects, and enjoyments) would have been either intrinsically valuable or, at least, valuable as means to ends, such ends being intrinsically valuable to that individual. To take a human life is to deprive the individual not only of what one values at present but also of what one would have come to value over time had one been allowed to live on, that is, to deprive one of all of the value that one’s future continued life had promised, a future that now will not exist. It is, says Marquis, this loss that makes the taking of a human life morally incorrect. This argument against the taking of a human life would apply not only to adults but also to young children and babies who, arguably, also have a future of value concerning life experiences, activities, projects, and enjoyments to which to look forward. In the same way, a human fetus has a similar future, one that, if the fetus is aborted, will never come to pass (Marquis, 1989).

An obvious criticism of this argument concerning the moral status of abortion is that Marquis’s argument suggests, at least, that the reason he identifies to support the claim that the taking of a human life, in the case of human adults, young children, babies, and even fetuses, is morally incorrect is, if not the only reason, then at least far and away the most important reason to support this claim. However, this is to minimize the importance of other such reasons, the plausibility of which also seems likely, such as the varying degrees of emotional pain and grief as suffered by the friends and loved ones of the victim and the denigrating effects on the perpetrator’s character, if only in terms of a desensitization to the value of human life itself. Finally, and notwithstanding the concept of personhood, Marquis’s argument, again, at least suggests that the prospective future of a human fetus is, if not identical to, then on a fundamental par with that of not only a baby or a young child but also a human adult. However, surely, there are relevant differences, not the least of which would be the capacity, more so for a young child than for a baby and more so for an adult than for a young child, to envision and to have anticipatory thoughts about one’s own prospective future and the value that it might hold, a capacity that, in theory, a fetus just does not have.

At least since the Roe v. Wade U. S. Supreme Court decision, the spectrum of positions on the issue of the moral status of abortion has been represented by an extreme conservative position, namely, that, without any exception, abortions of human fetuses ought never to be allowed; by an extreme liberal position, namely, that abortions of human fetuses ought always to be allowed, and for any reason whatsoever; and by more moderate positions, like, for example, that abortions of human fetuses ought not to be allowed, in general, but ought to be allowed in cases in which the following circumstances serve as the exceptions: in cases in which pregnancies have occurred as a result of the act of rape or the act of incest, or in cases in which the life of the expectant mother is seriously jeopardized by the pregnancy itself.

Although the Roe v. Wade U.S. Supreme Court decision was settled precedent for the previous half century, on June 24, 2022 the U.S. Supreme Court handed down its opinion in the Dobbs v. Jackson Women’s Health Organization case, which explicitly overturned both Roe v. Wade and Planned Parenthood of Southeastern Pennsylvania v. Casey (1992), the latter of which had upheld a woman’s right to access an abortion, in effect reaffirming Roe v. Wade. The 6-3 vote in the Dobbs v. Jackson Women’s Health Organization case held that “the [U.S.] Constitution does not confer a right to abortion. Roe and Casey must be overruled, and the authority to regulate abortion must be returned to the people and their elected representatives.” In effect, this means that the individual states have the authority to regulate abortion for “legitimate reasons,” up to and including the banning of abortion with or without any exceptions. As yet another landmark decision, Dobbs v. Jackson Women’s Health Organization has proven to be at least as controversial as Roe v. Wade ever was.

It is likely that people, in societies throughout the world, will continue to stake out positions on this issue as influenced by their cultural and/or religious beliefs, by the beliefs of their ancestors and/or living relatives, by their own ignorance or knowledge on the subject, and for all other manner of reasons, but it is unlikely that the spectrum of positions on the issue of the moral status of abortion will change.

ii. Human Death: Euthanasia and Physician-Assisted Suicide

Euthanasia is an intervention in the standard medical course of treatment of a patient who is reasonably considered to be terminally, or irreversibly, ill or injured for the express purpose of causing the imminent death of that patient, normally for reasons of mercy.

Whenever a patient who is competent to make health care decisions for oneself, and who is under no coercion from anyone else, makes an explicit request (oral or written) to be euthanized, the case in question is one of “voluntary euthanasia.” Moreover, whenever a patient is not competent to make health care decisions for oneself but an advance directive has been properly provided on the patient’s behalf, one that was properly executed by the patient prior to becoming incompetent to make health care decisions for oneself and that explicitly expresses (in the case of a living will) or explicitly authorizes a surrogate to express (in the case of a durable power of attorney for health care decisions) the request to be euthanized under certain specified conditions, and these conditions are present, the case in question is also one of “voluntary euthanasia.”

Whenever a patient is not competent to make health care decisions for oneself and no advance directive has been properly provided on the patient’s behalf, but a patient advocate (that is, a close relative whose decision-making authority is recognized in the law or, failing that, a more distant relative or friend) makes an explicit request (oral or written) that the patient in question be euthanized, the case in question is one of “non-voluntary euthanasia.”

Whenever a patient is competent to make health care decisions for oneself but someone other than the patient makes the decision that the patient be euthanized and does so without the consent of the patient (either because the patient was never consulted on the matter or because the patient was consulted but chose not to give consent), the case in question is one of “involuntary euthanasia.” While neither voluntary nor non-voluntary euthanasia presents any moral concerns, by its very nature, it is impossible to imagine a situation in which involuntary euthanasia could ever be morally justifiable.

When an instance of euthanasia takes the form of the committing of an action, it is usually referred to as “active euthanasia”; when such an instance takes the form of refraining from the committing of an action, it is usually referred to as “passive euthanasia.” The administering of a lethal injection would be an example of the former; the withholding of a regular course of medical treatment in order for a fatal injury, illness, or disease to take its natural toll would be an example of the latter. This distinction between active and passive euthanasia has been, historically, the focal point of the most controversy concerning the practice of euthanasia.

Traditionally, all health care-related professional codes of ethics find passive euthanasia to be morally allowable but active euthanasia to be tantamount to murder; the relevant laws in all of the legal jurisdictions in America follow suit. However, an argument can be made that terminally ill or injured patients ought to be allowed, both morally and legally, to decide when their own lives should end and whether it should be by active or passive euthanasia; the justification for such allowances would be a true respect for the right of such patients to self-determination concerning not only their own health care, but also the duration of their own lives as well as the means by which their lives are to end, which would be an instance of a true respect for the autonomy of such patients. Indeed, an additional argument can be advanced in an effort to uphold the patient’s right to beneficent health care. That is, in an effort to attempt to “do good” on behalf of, or to benefit, a terminally ill or injured patient, once again, one could argue that such patients should be allowed to decide their own fate and the means by which to achieve their chosen fate, that is, by the method of either active or passive euthanasia.

James Rachels, in a famous article on this very question (Rachels, 1975), attempts to demonstrate that this controversy represents a distinction without a difference. That is, Rachels argues that there are, indeed, no relevant moral differences between active and passive euthanasia, and that, in order to be consistent in one’s thinking, one has to acknowledge that active and passive euthanasia are either both morally allowable or both morally condemnable. Winston Nesbitt argues that Rachels fails to prove that the ordinary interpretation of responses to the two agents in Rachels’s famous comparative examples would be the same, which is the heart of the case that Rachels sets forth (Nesbitt, 1995; see also Callahan, 1989).

Related to the topic of active euthanasia is what has come to be known as “the doctrine of double effect.” This doctrine has a long and rich history in the teachings of the Roman Catholic Church but has only been applied to cases of terminally ill patients in the early 21st century. In its application to patients with terminal diagnoses who receive palliative care, the doctrine of double effect is typically invoked in an effort to justify on moral (if not legal) grounds the commission of an action by a medical professional the intention of which is to relieve the patient’s usually excruciating physiological pain while being fully cognizant of the likely, but unintended, consequence of causing the death of the patient. For example, a cancer patient, with a prognosis of only a matter of days to live, continues on a regimen of the sedative lorazepam and the opioid morphine. With increasing frequency, the patient has complained of the worsening of the pain and has repeatedly requested ever-higher doses of the morphine drip. In response to each of these requests, the physician has complied, knowing full well that there will be a threshold beyond which the dosage of morphine will be sufficient (in conjunction with a myriad of other causal factors that are idiosyncratic to this patient) to kill the patient. This, then, comes to pass. If asked by a nurse on this case whether anyone was culpable for the patient’s death, the physician would, typically, reply that no one was so culpable because, even with the final increase in the dosage of morphine, the intention was not to kill the patient; rather, the intention was to alleviate the patient’s pain.

The myriad of other causal factors that can, mutually, hasten such a patient’s death include (but would not be limited to) the patient’s body weight, the status of the patient’s immune system, the effects of the progression of the cancer, the effects of other medications, and whether the patient is still receiving nutrition and hydration. The key factor in the doctrine of double effect is the intention on the part of the medical professional in question. As long as the action in question is deemed a good one, the intention was the beneficial effect (alleviating the patient’s pain) rather than the harmful effect (killing the patient), the beneficial effect stemmed from the action directly rather than as a result of the harmful effect, and the beneficial effect outweighed, in importance, that of the harmful effect, then the action in question is determined to have been morally (if not also legally) allowable by the doctrine of double effect. However, the most fundamental criticism of the application of the doctrine of double effect to such cases is that there is no relevant moral distinction between the action in question and an instance of active euthanasia.

Palliative sedation, as the monitored use of medications, including sedatives and opioids, among others, to provide relief from otherwise unmitigated and excruciating physiological, among other types of, pain or distress by inducing any of a number of degrees of unconsciousness, can be similarly problematic depending on whether and to what extent the pain or distress of the patient in question is managed appropriately. If managed well, palliative sedation need not be a causal factor in hastening the death of the patient; however, if it is not managed well, in theory, palliative sedation can be such a causal factor.

If “suicide” were to be understood as one’s pursuit of a plan of action the effect of which is expected to be the intentional premature death of oneself, then “assisted suicide” can be understood to be one’s pursuit of a plan of action the effect of which is expected to be the intentional premature death of oneself but the effect of which, in order to be successful, needs to be facilitated in some way, shape, or form by someone else. If that someone else were to be a physician, then it would constitute a case of “physician-assisted suicide.” Public attention was brought to bear on the issue of physician-assisted suicide in America by Dr. Jack Kevorkian who, throughout the final decade of the 20th century, as a retired pathologist, offered to help fatally ill patients to end their lives prematurely. Prior to his fifth, and final, prosecution, which was for second degree murder, and for which he was convicted (having avoided this fate the first four times), he claimed to have assisted approximately 130 patients to end their lives, something that, as he had maintained throughout his entire medical career, patients ought to have a right (both morally and legally) to do. Despite the fact that all health care-related professional codes of ethics have consistently condemned, and still do condemn, physician-assisted suicide, currently, at least five of the fifty states in America have legalized physician-assisted suicide. Among those European nations that had legalized both active euthanasia and physician-assisted suicide by the early 21st century, the Netherlands has led the way (Kevorkian, 1991).

c. Human Subject Research

Theoretically, the most fundamental reason to conduct research involving human subjects is to add to our existing knowledge concerning the physiological and the psychological constitution of the human body and the human mind, respectively, in an effort to improve the quality of life of people as determined by the status of their bodily and mental health. Thus, the principle of beneficence should lie at the heart of all research that is conducted with human subjects. The history of such research is one of major achievements, typically incremental and over time, each of which has played a part in the extension of not only the duration of human life but also the quality of the day-to-day existence of members of the human race, virtually all over the planet. However, many are the moral issues that have arisen due to the mistreatment to which many such human subjects have been subjected, and which have occurred in any of a number of important ways, from physiological abuse to mental and emotional abuse to the abuse of human rights. The history of human subject research is replete with examples of such abuses. By the middle of the 20th century, enough people in sufficiently important roles in Western societies began to codify what they took to be some of the most basic moral rights that would need to be respected in order for human subject research to be recognized as morally acceptable.

i. The Rights of Subjects

Over many decades throughout the second half of the 20th century, a variety of codes of ethics were developed for the protection of the rights of people who serve as human research subjects. In virtually every case, the codes that were of the most importance were formulated in response to specific cases of human subject research during the course of which at least some of the people who served as participants had some of their fundamental rights abused. A few examples follow.

The Nuremberg Code (1949) was formulated in response to experiments that were performed on people who were members of demographic groups that were targeted for extermination by Hitler in Nazi Germany and that were conducted by medical doctors and biomedical researchers some of whom had little to no expertise or experience in either the practice of medicine or the conducting of biomedical research. In the judgment of those who prosecuted two dozen of these experimenters in what came to be known as the “Doctor Trials,” held in Nuremberg after the more famous Nuremberg trials in which the Third Reich’s major suspected war criminals were prosecuted, the main charge for which the defendants were tried was the murderous and torturous human experiments that were conducted in many of the concentration camps and the prisoner-of-war camps. Of the ten principles in the Code, the emphasis, in general, was on the need for biomedical researchers to obtain the voluntary informed consent of the prospective human subjects prior to the commencement of any such experimentation. The second most important right of human subjects of such research to be emphasized in the Code was the human subject’s right to protect oneself by determining whether, and when, it is in one’s own interest to end one’s own participation in such an experiment, without fear of any penalty or punishment. Despite having no legal force, the Nuremberg Code has had profound effects on the ethics of human experimentation and has spawned a good number of other such codes since its formulation.

The Declaration of Helsinki (1964, and with multiple revised versions since) was adopted by the World Medical Association’s World Medical Assembly with the title, “Recommendations Guiding Medical Doctors in Biomedical Research Involving Human Subjects.” This code of ethics consists of a host of recommendations, the result of which is the establishment of the following moral principles: 1) a competence requirement for research investigators, 2) a requirement that the significance and importance of any expected positive outcomes of the research outweigh any anticipated risks to the human subjects, 3) a requirement of informed consent on the part of the human subjects, and 4) a requirement for the external review of all of the research protocols.

The National Research Act (1974) created the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research and did so in direct response to the infamous “Tuskegee Syphilis Study” (1932-1972). This was a study of approximately 400 African-American male sharecroppers, each of whom suffered from this most serious of venereal diseases; its stated purpose was to ascertain whether there were any significant differences between the progression of syphilis in African-American men and its progression in Caucasian men. The participants in this study, begun during the throes of the Great Depression and in one of the economically poorest regions of America, were promised free food and free medical care for their participation. However, rather than being informed of the venereal disease from which they suffered, they were told only that they had “bad blood.” Most of these men were married and continued to have conjugal relations with their wives and to produce children; many of these wives and newborn babies were infected with syphilis. Worse, even after penicillin was discovered and approved as modern medicine’s first antibiotic (and found, by the late 1940s, to be effective against a variety of bacterial infections in humans, including syphilis), these men were never informed about this “miracle cure,” and the health care professionals who were conducting the study knowingly and intentionally refrained from administering penicillin to any of its participants (Brandt, 1978).

The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research (1979) was generated by the above-mentioned commission and identified boundaries between the practice of routine medical care as compared to biomedical research protocols (again, as a direct result of the “Tuskegee Syphilis Study”), identified moral guidelines for the process by which research subjects are selected and for their informed consent, and emphasized the moral principle of respect for research subjects as persons as well as the ethical principles of beneficence and justice in the treatment of human subjects.

The Public Health Service Act (1985) mandated that every research facility in America that conducts either biomedical or behavioral research on human subjects have an Institutional Review Board (IRB) for the protection of the rights of human research subjects. This requirement that each such research institution (academic or otherwise) obtain IRB approval for each and every biomedical or behavioral research study was a result of many instances of research protocols that, for a variety of reasons, were thought, at least in retrospect, to have violated the human rights of their participants. For example, the “Stanford Prison Experiment” (1971) was a behavioral study, the purpose of which was to identify and analyze the psychological effects of the relationship between prison guards and prisoners on members of each group, but which took on a life of its own and resulted in a good number of human rights violations. As for biomedical research, the famous case of Henrietta Lacks and her HeLa cells made possible dozens of medical breakthroughs in the curing of diseases in the latter half of the twentieth century and made large amounts of money for some of the people and institutions involved in the research. Meanwhile, most of her descendants, including some of her own children, lived their entire lives without health insurance, and some of them were, even if temporarily, homeless. Only recently has attention been brought to her story, and to this situation, by her biographer (Skloot, 2010).

The composition of the membership of all Institutional Review Boards (IRBs) is mandated to reflect diversity with respect to gender, race, and culture or heritage, as well as a diversity of social experiences and an appreciation for issues (relevant to research involving human subjects) that reflect the standards and values of society, if not also of the local community. The fundamental goal of all IRBs is to determine the acceptability of all research proposals involving human subjects, based on the extent to which such proposals adhere to all relevant federal, state, and local laws, the research institution’s own policies and regulations, and all relevant standards of professional conduct, as mandated by the federal government. Moreover, IRBs are obligated to ensure that all proper procedures are followed for the voluntary informed consent of all of the subjects of all research projects.

In addition to enforcing stringent standards in order to ensure that the consent of prospective human participants be truly informed, IRBs are mandated to enforce equally strict standards concerning the following: that potential risks as well as expected benefits of the research protocols are made clear to prospective participants; that information of a personal nature that is obtained on research participants is kept in strict confidence; and that any research participant who is, simultaneously, a patient (whether in a medical facility or not) under medical treatment, is made sufficiently aware of the differences between those practices that are a part of one’s medical treatment as compared to those practices that are a part of the research protocol. In other words, researchers, in such situations, are morally obligated to exercise what sometimes might constitute supererogatory measures in an effort to help the research participant to be aware of which procedures that one is subjected to are a part of one’s medical treatment and which procedures that one is subjected to are a part of the research study, which might or might not be expected to be of therapeutic value.

The moral issues that have arisen, over decades, concerning human subjects in both biomedical and behavioral research are many and varied. In biomedical research, such issues include the exclusion of members of specific demographic groups from even being considered eligible to participate in such research. For example, until the latter part of the 20th century in America, biomedical research on breast cancer was almost nonexistent. Not until women, in decent numbers, had entered the field of medicine and the field of biomedical research did research proposals into various aspects of breast cancer begin to compete for funding with research proposals into various aspects of prostate cancer. Furthermore, even biomedical research into, for example, the correlative, if not causal, factors involved in heart disease solicited only Caucasian males as prospective research participants. The National Institutes of Health (NIH) Revitalization Act of 1993 was a response to what some viewed as unjust funding priorities and unfair funding criteria. It mandates that women and members of minority groups be included in all research that is funded by the NIH unless there is a “clear and compelling” reason that their inclusion in such research is “inappropriate” with respect to the health of the prospective subjects themselves or the purpose(s) of the research. Examples of appropriate exclusionary practices would be biomedical research into testicular cancer, which would properly exclude women, just as biomedical research into sickle-cell anemia would properly exclude Caucasians.

One of the most popularly known moral issues concerning both biomedical and behavioral research is the use of placebos. The classic case of the use of placebos is the clinical drug trial, in which researchers are attempting to determine, first, the effectiveness of the experimental drug, and second, the extent to which potential adverse side-effects of the experimental drug are significant, if not fatal. Typically, the study includes two groups of participants: those who are given the experimental drug and those who are given a placebo (popularly known as a “sugar pill” because it is designed to have no relevant effect at all on the research participant to whom it is administered). In order to ensure credibility concerning the use of a placebo, the participants in both groups are intentionally deceived as to which group of participants is receiving the experimental drug and which is receiving the placebo. To ensure even more credibility, the researchers orchestrate not only a blind study, as just mentioned, but a double-blind study, in which the participants of each group are not told which group is receiving the experimental drug and which is receiving the placebo, and the researchers themselves do not know this information either. The main reason for a blind study is to avoid any possibility of what we might refer to as suggestive bias on the part of the participant concerning the possible effectiveness of the experimental drug. The main reason for a double-blind study is to avoid any possibility of what we might call expectation bias on the part of the researchers themselves concerning either the effectiveness, or the lack thereof, of the experimental drug.
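The allocation step of a double-blind study can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a description of any actual trial protocol: the function name `allocate_double_blind`, the arm codes "A" and "B", and the idea of an independent third party holding the key are all hypothetical choices made for the example.

```python
import random


def allocate_double_blind(participant_ids, seed=None):
    """Randomly split participants into two coded arms (hypothetical sketch).

    Returns a pair (assignments, key):
      assignments maps each participant id to an opaque arm code, "A" or "B";
        this is all that participants and front-line researchers would see.
      key maps each arm code to "drug" or "placebo"; in this sketch it is
        assumed to be held by an independent third party until the trial ends.
    """
    rng = random.Random(seed)

    # Shuffle the participants so that arm membership is random.
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2

    # Randomize which opaque code stands for the experimental drug.
    codes = ["A", "B"]
    rng.shuffle(codes)
    key = {codes[0]: "drug", codes[1]: "placebo"}

    # The first half of the shuffled list goes to arm "A", the rest to arm "B".
    assignments = {pid: "A" for pid in ids[:half]}
    assignments.update({pid: "B" for pid in ids[half:]})
    return assignments, key


if __name__ == "__main__":
    assignments, key = allocate_double_blind(["p1", "p2", "p3", "p4"], seed=42)
    print(assignments)  # what participants and researchers may see
    # `key` would be withheld from both groups until unblinding.
```

Because neither the participants nor the researchers who interact with them hold `key`, neither group can tell which arm code corresponds to the experimental drug, which is the feature that distinguishes a double-blind design from a merely blind one.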

The use of placebos in biomedical or behavioral research does raise questions concerning the ethical principle of beneficence in addition to the moral right to be told the truth. First, in theory, the participants in many, if not most, clinical trials, including drug trials, have reasonable expectations of benefitting in any of a number of ways from their participation in such research. At least in cases in which such a participant is, simultaneously, a patient with a terminal illness who ends up in the placebo designated group, it would appear that the right to beneficent treatment is being thwarted. In such a situation, and by the nature of the case, such a participant would be, perhaps, literally, betting one’s life on, in this case, the experimental drug. Second, to the extent to which participants in human subject research are being deceived, knowingly and intentionally by the researchers, which is a necessary part of any research study involving the use of placebos, a case can be made that the moral right to be told the truth, on the part of the research participant, has been violated (regardless of whether such participants are also, simultaneously, patients who are receiving medical treatment). Of course, the response to either of these criticisms of research protocols that make use of placebos is that the participants agree to the use of placebos and know, full well and in advance, that they have an equal opportunity to be members of the group who receive the placebo or members of the group who do not.

ii. Vulnerable Populations

By the nature of the case, there are some groups of people in society who are especially susceptible to abuse, concerning their rights, whenever they are the subjects of human research. Such vulnerable populations are as follows: babies, including neonates (as well as human fetuses and the subjects of human in vitro fertilization, at least in theory); children; pregnant women; prison inmates; undergraduate and graduate students; the members of any demographic minority group; and anyone who is cognitively challenged, physiologically challenged, educationally disadvantaged, economically disadvantaged, significantly compromised in one’s health, terminally ill, injured, or disadvantaged in any other relevant way.

Of particular concern in the recruitment of human research subjects, especially in cases involving prospective participants who are known to be vulnerable in any important and relevant respect(s), is the issue of coercion, whether explicit or implicit. With the exception of the first, people in every category of vulnerable population enumerated above would be susceptible, for a variety of reasons, to the influence of coercion by recruiters for human subject research. Whenever possible, biomedical and behavioral researchers should refrain from even attempting to recruit, as a prospective participant, anyone who is reasonably identifiable as a member of any vulnerable population. In the event that a biomedical or behavioral researcher needs to recruit any such vulnerable prospective participants (by virtue of the nature of the research itself), the researcher has a moral obligation to be aware of the likelihood that the prospective participants in question will feel coerced (either explicitly or implicitly, and whether they are aware of it or not) to “voluntarily” consent to participate in the research project in question. In such a situation, the researcher is morally obligated to engage in supererogatory efforts to minimize, as best one can, the effects of the coercion involved.

Once recruited, the most fundamental concern of the biomedical or behavioral researcher is the need to ensure, as best one can, that the participant (as a member of a vulnerable population) is as fully informed as possible, with respect to all relevant information concerning the proposed research project and the participant’s role in it, in an effort to approximate, once again, as best one can, truly informed consent on the part of the participant. The main reason for this concern is that any particular research participant, who is vulnerable in any important and relevant respect(s), might find it difficult, if not impossible, to comprehend any, much less all, of the relevant information concerning the proposed research project and one’s own role in it, for any of a number of reasons, for example, insufficient comprehension abilities, insufficient familiarity with the language spoken by the researchers, inadequate cognitive abilities, chronic pain of such intensity as to inhibit one’s cognitive processes in the case of a research participant who is also a patient with at least one acute health issue, and more.

d. Reproductive and Genetic Technologies

Throughout the history of the practice of health care, the acquisition of knowledge and the innovation of medical technologies have brought with them new moral issues. Beginning in the last quarter of the 20th century and continuing into the 21st century, advancements in knowledge and technologies concerning human reproduction and human genetics have spawned whole new types of moral questions and moral issues, many of which involve even more complexities than the previous ones.

i. Reproductive Opportunities for Choice

The last quarter of the 20th century brought with it major advances in biological knowledge and in biological technology that allowed, for the first time in human history, for the birth of human offspring to result from biological interventions in the procreative process. For those whose ability to procreate was biologically compromised, new scientific methods were developed to facilitate success in that process. Such methods include artificial insemination (AI), in vitro fertilization (IVF), and surrogate motherhood (SM).

Artificial insemination is the process by which the sperm is manually inserted inside of the uterus during ovulation. In vitro fertilization is the process of uniting the sperm with the egg in a petri dish rather than allowing this process to take place inside the woman’s body. To increase the probability of success, multiple embryos are transferred to the uterus. As a result, multiple pregnancies are not uncommon. These multiple pregnancies increase the probability of premature births, which usually result in low birth weight, under-developed organs, and other health issues. As to the embryos that are not chosen for transfer, the normal practice is to freeze them for possible future use because the success rate for any given round of IVF is only approximately 1 in 3.

Many opponents of IVF focus on the probability of the resultant health issues; in other words, to bring into the world, in a contrived way, children who stand a reasonable chance of suffering any of a number of health problems is unfair to such children (Cohen, 1996), if not also to the society into which they are born. Others disagree and argue that to be the recipient of the gift of life would more than outweigh the usual health issues that might result from IVF (Robertson, 1983). Some commentators argue that reproductive technologies, such as AI and IVF, allow women the opportunity to realize their potential for autonomous decision-making when it comes to their own reproductive preferences (Robertson, 1994 and Warren, 1988). Another criticism is the likelihood that the children, so produced, will be viewed as, somehow, inferior to children who are born as a result of the traditional process of procreation. There are also moral issues concerning frozen embryos. First, the longer that an embryo is maintained in a frozen state, the more likely it is that it will become degraded to the extent that either it is no longer capable of being used for its intended purpose or it is no longer alive. Second, there are serious questions as to what the fate of these frozen embryos should be when, for example, because of the splitting up of the relationship of the biological parents or the death of one, or both, of these parents, such embryos are left in a state of limbo. Should they be used for scientific research, should they be offered to other people, whose compromised procreative abilities dictate a need for such embryos to be brought to fruition through the process of IVF, or should such embryos merely be discarded?

Surrogate motherhood is the process by which one woman carries to term a fetus for someone else (typically a couple). The surrogate mother is impregnated by the method of either AI (traditional surrogacy, according to which the surrogate mother’s egg is fertilized) or IVF (gestational surrogacy, according to which an embryo is transferred to the uterus of the surrogate mother). Not only in the former case (in which the surrogate mother is also the genetic mother) but also in the latter case (in which the surrogate mother is not the genetic mother), one of the most important moral, if not also legal, issues has always been whether the surrogate mother has any proprietary rights to the newborn baby, regardless of whether a legal contract applies and regardless of whether any money changes hands.

Another fundamental moral issue occurs in cases in which there is a contractual relationship as a legal guarantee for a financial agreement. Such cases raise the moral issue of whether fetuses and newborn babies should be treated as commodities, and indeed, whether the womb of the surrogate mother should be rented out as a service for someone else, that is, also treated as a mere commodity (Anderson, 1990). However, not all commentators on this subject agree that surrogate motherhood can, of necessity, be reduced to the crass practice of baby selling or that women who serve as surrogate mothers are, necessarily, exploited. On the contrary, it can be argued that women who serve as surrogate mothers are willing to forgo any parental right that they might have to begin, much less to maintain, an inter-personal relationship with the babies they deliver. In the same way in which this forgoing of any parental right to engage in any type of inter-personal relationship with the baby appears to not be offensive in cases of surrogate motherhood, when engaged in for altruistic reasons, consistency would seem to demand that no such offense should enter into the situation just because an exchange of money is involved; in other words, the motive is not relevant to the moral assessment of the process of surrogate motherhood (Purdy, 1989).

Artificial insemination, in vitro fertilization, and surrogate motherhood have been defended on the ground that the right to reproductive freedom, including the right to exercise one’s autonomy concerning procreation, allows for any such means to bring children into the world.

Cloning is the asexual reproduction of an organism that is genetically identical to the organism that serves as its progenitor. Cloning has always been a natural process of reproduction for many bacteria, plants, and even some insects, and it has been used as an intervention in the reproduction of plants for hundreds of years. However, since the successful cloning of a sheep named Dolly in 1996, major moral concerns have been voiced concerning the ability of scientists to clone, not only other animals, but also human beings. Despite some claims to the contrary, none of which has ever been verified, the cloning of human beings is not yet feasible.

The purpose of therapeutic cloning is to create an embryo, the stem cells of which are identical to its donor cell and are able to be used in scientific research in order to better understand some diseases, from which can be derived treatments for such diseases. The same moral issues concerning the use and ultimate fate of human embryos, as aforementioned, apply to these cloned human embryos.

The purpose of reproductive cloning is to create an embryo, which if brought to fruition will become a member of the animal kingdom. In the successful attempts to clone a variety of animals to date, a consistent problem has been health issues related to significant defects in major organs, including the heart and the brain; in addition, the duration of the lives of these cloned animals has been, on average, only half of the normal life expectancy of such species. Moreover, each successful attempt to clone these animals has been preceded by literally dozens, if not hundreds, of unsuccessful attempts. These same problems would represent major moral concerns in any attempt to clone human beings. However, were any such attempt to be successful and were the resultant cloned human being to be of sufficiently good health to lead anything like a normal existence, new moral issues would arise. Would such cloned human beings be viewed as second-class members of the human race? Would they be deprived, either socially or legally, of some of the fundamental freedoms that are normally afforded people, for example, the right to exercise one’s own autonomy? Would cloned human beings have been robbed of the exact same uniqueness (in terms of their physiology, their personality characteristics, and their character traits) that every human being in the history of humankind has hitherto enjoyed? (That a cloned human being would be genetically identical to its progenitor does not mean that it would, of necessity, have exactly the same life as its progenitor, given its own idiosyncratic experiences in utero and in life in a large number and variety of ways) (National Academy of Sciences, 2002). This last point notwithstanding, would cloned human beings be denied rights to their own identity (Brock, 1998)?

Any scientific researcher who has aspirations to clone a human being would be well advised to read, carefully, Mary Shelley’s Frankenstein; or, the Modern Prometheus. Published in 1818, this work of science fiction leaves the reader with the not too subtle warning that one ought to keep one’s hubris in check; for to create anything, much less an artificial man, is, almost certainly, to fail to be willing, or, perhaps, to be unable, to anticipate many of the important untoward consequences of one’s actions, and equally problematic, to over-estimate one’s ability to exercise control over one’s own creation.

ii. Genetic Opportunities for Choice

The molecular structure of deoxyribonucleic acid (DNA), the molecule that contains the genetic instructions that are necessary for all living organisms to develop and to reproduce, was discovered in 1953. Some fifty years later came the completion of the mapping of the human genome, popularly known as the Human Genome Project, that is, the identification of the complete and exact sequencing of the billions of elements that make up the DNA code of the human body. Since these two milestones, a vast amount of research has been conducted in the area of disease-causing mutations as causes of many human genetic disorders. This research has also allowed for the creation of literally thousands of genetic tests, the purpose of which is to detect, both in the case of prospective parents and at the fetal stage of the development of human offspring, those genetic mutations that are responsible, in part or in whole, for many non-fatal and fatal conditions and diseases. Furthermore, this research has allowed for the editing of human genes, in an effort to proactively disable some genetic mutations, in the case of adults, children, and newborns as well as in the fetal stage of development. The information derived from genetic testing, more often than not, is anything but definitive; in other words, the results of the vast majority of genetic tests predict only the probability that the disease or condition for which the testing was done will actually develop. Whether such probabilities are low, moderate, or high, many other factors, especially environmental ones, can also be contributing factors. Further, while many genetic tests are available for the detection of conditions and diseases for which there is, at present, a cure, many other genetic tests are able to be conducted for conditions and diseases for which there are no cures. This fact raises the obvious question of whether specific individuals do or do not want to know that there is a probability, to whatever degree, that they will fall victim to a particular condition or disease for which there is no cure.

Each of the advances in genetic knowledge, genetic technologies, and biomedical capabilities concerning genetics brings in its train its own set of moral concerns. Genetic disorders such as amyotrophic lateral sclerosis (ALS, popularly known as Lou Gehrig’s Disease), a motor neuron disease, which is always fatal, can be familial, that is, one who has inherited the gene mutation for ALS has a 50% chance of passing the mutated gene on to any of their offspring. However, one who inherits the mutated gene might or might not fall victim to the ravages of the disease. It is conceivable that an individual, who has begun to exhibit some of the early symptoms of ALS, might choose to be tested for any of the four gene mutations that are thought to be causal. If such testing reveals the presence of one or more such mutations, and if this individual has children, the moral issue of whether any such children should be informed, immediately, and if they are so informed, the moral issue of whether such children should choose, themselves, to be tested, both become of paramount importance, if only because, depending on the outcome of the genetic testing of these children, the fate of any of their children (already in existence or as future possibilities) would be a concern.
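The weight of the 50% figure for a family with several children can be illustrated with a small calculation. The following Python sketch rests on a simplifying assumption that is not stated in the text, namely that each child’s inheritance of the mutated gene is an independent event; the function name `prob_at_least_one_inherits` is hypothetical. It concerns only the inheritance of the mutation, which, as noted above, is not the same as developing the disease.

```python
def prob_at_least_one_inherits(n_children, p_per_child=0.5):
    """Probability that at least one of n children inherits the mutated gene.

    Assumes (for illustration only) that each child independently inherits
    the mutation with probability p_per_child, so the chance that none of
    them does is (1 - p_per_child) ** n_children.
    """
    if n_children < 0:
        raise ValueError("n_children must be non-negative")
    return 1 - (1 - p_per_child) ** n_children


if __name__ == "__main__":
    # Print the probability for families of one to four children.
    for n in range(1, 5):
        print(n, prob_at_least_one_inherits(n))
```

Under these assumptions, the probability that at least one child carries the mutation rises quickly with family size, which is part of why the question of whether, and when, to inform one’s children becomes so pressing.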

Another moral issue that continues to arise in the context of genetic testing occurs when an adult or a child is tested for one condition or disease and a mutated gene is discovered for another potentially fatal condition or disease. This situation can occur because much genetic testing, at present, is sufficiently broad in its application as to include a variety of different genes. So, it sometimes happens that genetic testing of a toddler, for example, for one or more genetic mutations (which are suspected due to the presence of specific relevant symptoms) reveals one or more other genetic mutations for conditions, diseases, or even specific cancers, or for young adult-onset cardiomyopathy, about which neither the researcher nor the pediatrician was even concerned. In such a case, questions arise as to whether such health risks (again, not anticipated but discovered by the genetic tests) for the toddler should be shared with the toddler’s parents, and if so, when they should be shared, that is, immediately or when the toddler is older (and if when the toddler is older, at what age). If it is not known whether the offending gene mutations are inherited or are merely spontaneous (which is a common occurrence), does the timing of informing the toddler’s parents become a moral issue, in the event that the toddler’s parents might expect to bring additional children into the world? And, what about the toddler: from the perspective of the pediatrician or the parents, at what age should the toddler be so informed (Wachbroit, 1996)?

The moral issues identified, concerning each of these two hypothetical situations, are reflective of the ethical issues that are most fundamental in health care, namely, cases of conflict involving the ethical principles of respect for the patient’s right to autonomous decision-making as compared to acts of paternalism on the part of health care professionals and as compared to the patient’s right to beneficence in one’s relationship with health care professionals.

In addition to therapeutic reasons for genetics research and its application to health care, there are non-therapeutic reasons for such research and applications, for example, genetic enhancement, that is, the application of genetic knowledge and technologies to improve any of a number of physiological, mental, or emotional human characteristics. Some commentators argue that genetic enhancement, as compared to genetic therapy, is morally objectionable for a number of reasons, not the least of which is that, in a free-market economic system in which genetic enhancement is not provided by the state to each citizen who might choose it, those who could afford to pay for it would have a decided advantage over those who could not (Glannon, 2001). Other commentators do not agree, arguing that any attempt to use gene therapy to cure any type of human dysfunction is, in no way, morally different from any attempt to use gene therapy to enhance human function in cases in which such enhancements serve to protect one’s health or life (Harris, 1993).

Julian Savulescu goes even further by arguing for what he calls “procreative beneficence,” which is that anyone who is making use of genetic testing for non-disease human traits should make selections in favor of a child, from among other available selections in favor of other possible children, who can be expected to have, based on all of the available genetic information, what he calls “the best life,” that is, “the life with the most well-being,” or a life that would be at least as good as the lives that any of the other possible children would be expected to have. For, according to Savulescu, some non-disease-related genes influence the probability of one’s leading the best life; there is good reason to use information, which is at our disposal and which concerns such genes; and one should select embryos or fetuses which, in accordance with the available genetic information (including such information concerning non-disease genes), have the best opportunity for leading to the best life. He does make clear that, consistent with the moral requirement to make selections in favor of the child who can be expected to have the best life, those individuals who are making such selections may be subjected to persuasion but ought not to be subjected to any coercion (Savulescu, 2001).

Stoller contends that Savulescu fails to make his case because the examples that he offers as ostensibly analogous to pre-implantation genetic diagnosis (PGD), a procedure that is used to screen IVF-created embryos for genetic disorders or diseases prior to their implantation, differ from it in morally relevant ways and consequently fail to justify his theory (Stoller, 2008).

Stem cell research, since its inception, has been the subject of much controversy. The pluripotent qualities of embryonic stem cells, that is, their ability to differentiate or to be converted into the cells that make up any of the human body’s parts, render them superior to adult stem cells when it comes to their use in genetic therapeutic research. Hence, many of the moral issues, mentioned above, that arise whenever embryos are used for research purposes apply equally to the use of embryonic stem cells. This is despite the fact that they hold out much promise in their application to minimize the negative effects of, if not cure, many previously incurable conditions and diseases, for example, coronary disease, diabetes, Parkinson’s disease, Alzheimer’s disease, spinal cord injuries, and many others.

As genetic research progresses to the point at which gene therapy is able to make use of not only somatic-cell therapy (that is, the modification of genes in the cells of any of a number of human body parts for therapeutic reasons) but also germ-line therapy (that is, the alteration of egg cells, sperm cells, and zygotes for therapeutic reasons), the health care applications are expected to increase exponentially in number. However, the most important moral concern raised by the prospect of eventually being able and willing to engage in germ-line therapy is that this type of gene modification, by its very nature, will affect an unknown number of people in the future as they inherit these genetic changes. By contrast, somatic-cell therapy can only affect the person whose genes are so modified.

e. The Allocation of Health Care Resources

Health care resources have never been unlimited in any society, regardless of the type of health care system that was employed. At least for the foreseeable future, this fact is unlikely to change, but it is this fact that necessitates some form of what is normally referred to as the rationing of health care resources. Health care resources include not only the availability of in-patient hospital (and other medical facility) beds, emergency room beds, surgical units, specialized surgical units, specialized treatment centers, diagnostic technology, and more, but also personnel resources, that is, health care professionals of every description.

Whenever the demand for health care resources exceeds their availability, the financial costs of such resources will rise; to the extent that, historically, the demand for such resources has consistently exceeded their availability, the financial costs of health care have also, consistently, risen. Because there are many other causal factors for this financial phenomenon, the rise in the financial costs of health care has been consistently exponential, in many countries, since the latter part of the 20th century. By the nature of the case, this occurs to a greater extent, and at a more rapid pace, in any country whose politicians and public policy makers decide to employ a health care system that does not provide universal coverage.

i. Organ Procurement and Transplantation

The procurement of human organs for transplantation in order to save the lives of those who otherwise would not survive represents what many consider to be a modern medical miracle, which became possible only in the latter half of the 20th century. However, like all such advances in medical knowledge and in medical technologies, human organ transplantation raises some fundamental moral issues. Throughout the brief history of human organ transplantation, there have been many more people who need organ transplants in order to survive than there are human organs available to be transplanted, and this problem is expected to continue. Consequently, the available organs, at any point in time, must be rationed, which raises the question of determining the relevant factors to be considered in deciding who receives transplanted organs and who does not.

Human organs that are necessary for human life, for example, hearts, lungs, or livers, can be harvested for transplantation into the bodies of people who will not survive without such a transplant only from the bodies of people who are recently deceased. However, a single kidney or bone marrow, for example, is usually harvested from the body of a donor who is alive and, presumably, well. In either case, in most countries, permission is required to be granted, legally and arguably also morally, in order for the harvesting to take place. Organ donor organizations exist to enlist as many citizens as possible, in countries in which organ harvesting has been legalized, to be organ donors so that, once such donors are deceased, health care professionals are authorized to harvest any of a number of viable organs from the deceased donor’s body. As is the case for any invasive medical procedure, permission is necessary for one to donate one’s kidney or bone marrow as well.

One of the most important moral issues concerning the recipients of human organs is that of the criteria used to select them. It should come as no surprise that one of the major factors to determine which prospective organ recipients are given priority on the waiting list is the age of the prospective recipient. With only rare exceptions, a young adult, as a prospective heart transplant recipient, will rank higher on the heart transplant waiting list than will an elderly adult, if the latter is even deemed to be eligible. Additional criteria that are used to determine both eligibility and ranking for organ transplantation include: 1) the extent to which the need for organ transplantation is urgent in order to save the prospective organ recipient’s life; and 2) the likelihood that, and the extent to which, the candidate for transplantation will benefit from the procedure, that is, its probability for success; but also, 3) the candidate’s history of deleterious health-related habits (for example, whether the candidate for a lung transplant has ever smoked cigarettes or other tobacco products, or currently does so); 4) the candidate’s ability to pay (either outright or through private or federally funded health insurance) for the procedure; and 5) the value of the candidate, by virtue of, for example, one’s occupation, to society (for example, a cancer biomedical researcher as compared to a high school custodian), and more. If the former two criteria do not seem to raise any moral concerns, each of the latter three, almost certainly, does.

While each of the first two of these criteria could be reflective of egalitarian principles of justice, according to which each candidate, as a person, is viewed as having equal value, each of the latter three of these criteria could be seen as beneficial to the best interests of society, that is, as promoting social utility. As such, egalitarian principles of justice do not necessarily promote what is in the best interests of society any more than social utility considerations necessarily promote what is in the best interests of the individual. However, the application of either of these two criteria is far less controversial than is the application of any one of the latter three criteria. It might be reasonable for people to disagree as to whether a candidate for a lung transplant, who smoked a pack of cigarettes each day for twenty years, is less deserving of such a transplant than another such candidate who has never smoked in one’s life. It might be reasonable for people to disagree as to whether a person who is otherwise a good candidate for an organ transplant should be rejected solely because this person cannot afford to pay for the procedure and has no access to health insurance. Finally, it might be reasonable for people to disagree as to whether a candidate for an organ transplant, who happens to be a cancer biomedical researcher, is any more deserving of such a transplant than is another medically qualified candidate, who happens to be a high school custodian.

Adding to the dissatisfaction that some people express concerning the rationing of human organs for transplantation, in America and in other countries, is the deference that is sometimes offered to people of social prominence. Publicly documented in America are cases in which, for example, a prominent former professional sports figure, who had cirrhosis of the liver due to decades of alcohol abuse, was offered a liver transplant despite being, at that time, far down on the waiting list, and a governor of an East Coast state was offered and received both a heart and a lung transplant, again despite being, at the time in question, far down on the waiting list due, at least in part, to his age and his health status. In fact, he died less than a year later.

Another moral issue that is endemic to the human organ transplant industry is the buying and selling of human organs for the purpose of transplantation. In some Central American and some South American countries as well as in some Middle Eastern countries, for the past several decades, there has been a thriving illegal market for human organs. More recently, this practice has spread to some European countries and even to America, where financially impoverished people find themselves in need of money for their own sustenance. Typically, such individuals are promised the equivalent of thousands of dollars for a kidney or bone marrow but find themselves at the mercy of the organ dealer for payment after the fact. Worse, too many times, such medical procedures are performed in non-clinical environments and sometimes by non-clinically trained harvesters.

Raising additional moral concerns is the practice of what is sometimes referred to as the “farming” of human organs, that is, to conceive and to bring to fruition a newborn (or, in some cases, the harvesting of human organs or tissue can be done at the fetal stage) or to maintain on life support the body of someone who has been determined to be brain dead in order to be able to harvest an organ or bone marrow for transplantation. In the former case, questions arise concerning the moral propriety of bringing a child into the world for the express purpose of harvesting some of its body parts. Depending on which specific organs might be harvested, the death of this newborn might be inevitable. In the latter case, anyone, from an anencephalic newborn to a child or an adult of any age, who, as a result of either a non-traumatic or a traumatic event, has been declared to be in a state of unresponsive wakefulness (popularly referred to as a “permanent vegetative state”), that is, a patient whose state of consciousness, due to severe damage to the brain, is not indicative of actual awareness but, at best, only partial awareness or arousal, and whose condition has lasted for three to six months, in the case of a non-traumatic cause, or at least twelve months, in the case of a traumatic cause, might be maintained on life support for the express purpose of harvesting any of a variety of human organs. Any such case introduces questions concerning any of the following moral issues: Is it ever morally allowable to keep the body of an otherwise brain dead person alive for the sole purpose of harvesting some of its organs? Even if brain dead, does such a practice violate any moral rights or interests of the individual in question? Even if the answer to these questions is in the negative, because this individual might be deemed to have the same physiological, and thereby moral, status as one who has died, does proper respect for the body of the dead dictate that this practice is morally improper?

Both the retail sale of human organs and the farming of human organs continue to raise the moral issue of whether, and to what extent, human organs should be treated as commodities to be bought and sold in the marketplace (legally or not) and grown for the express purpose of harvesting for transplantation. Twenty-first century stem cell research holds out the promise of eventually being able to produce, in theory, any human body part from a single cell of one’s own body. To the extent that these prospects become realities, many of the moral issues that are raised by the procurement and the transplantation of human organs will become moot.

ii. The Question of Eligibility in Health Care

The question of who, in a given society, should be eligible to receive health care is one of the most important ethical issues concerning the provision of health care in the 21st century. This is because of the stark contrasts that exist concerning the distribution of health care when comparing America to other nations. America is the only one of the thirty or more wealthiest nations on the planet that continues to lack universal health care. Universal health care, by the nature of the case, leaves out of its financing equation private health care insurance providers. By contrast, in America, these private health care insurance providers are the primary drivers of the health care system, determining who is eligible for health care insurance coverage; what particular health care services they choose to finance, and for whom, including not only diagnostic procedures but also surgical and other invasive medical procedures; the lengths of stays in hospitals or other medical facilities, for both surgical and non-surgical patients; the cost of health insurance premiums as well as financial deductibles and co-payments to be paid by their customers; the fees for services for physicians, surgeons, and other health care professionals, and the percentage of such fees that they will pay; the particular prescription medications that they deem eligible for payment by themselves and how much, in co-payments, their customers have to pay; and many additional factors that affect both the health and the finances of those who maintain such insurance coverage.

In fact, there is a direct relationship, due to the effects of this type of health care system, between the health care and the finances of all members of society (both those with health insurance and those without). Many members of American society with health insurance, by virtue of their own personal financial situations, face the choice, usually on a regular basis, as to whether they can afford to pay the financial deductibles and/or the co-payments for their own health care because their earned weekly wages, all too often, preclude them from making these payments in addition to paying for rent, food, and other necessities for their families and for themselves. Added to these issues is the fact that not all health insurance plans are the same concerning which services and procedures they cover and which they do not, the practical effect of which is that many families with working parents do not have health insurance coverage for many important and significant health care services and procedures, or even prescription medications. Worse, a large percentage of wage earners, and some salaried employees, cannot, reasonably, afford to pay the costs of health insurance premiums, and so, have no health insurance coverage at all. The practical effect of this is that in addition to not being able to afford, out of pocket, health care services or procedures that serve to maintain one’s reasonably good health status, these individuals cannot afford to seek medical attention when they experience symptoms even of a dire nature.

All of these facts concerning the health care system in America as compared to the health care systems in virtually every other reasonably wealthy nation in the world raise the following questions of a moral nature. Does each and every citizen of any society have a moral right to health care? If so, does the government of any society have a moral obligation to provide each and every one of its citizens with health care? These questions, by their very nature, raise the issue of the extent to which the ethical principle of justice can be realized in any given society. At the societal level, the ethical principle of justice is applicable, fundamentally, to the ways in which goods and services as well as rights, liberties, opportunities for social and economic advancement, duties, responsibilities, and many other entities (both tangible and intangible) are distributed to citizens. The application of the ethical principle of justice to these questions concerning health care provides a benchmark for the determination of which types of health care systems are more, or less, just than others.

While any of the methods of moral decision-making, as delineated above, could be applied in fruitful ways to such questions, it might be more instructive to apply two public policy perspectives: libertarianism and egalitarianism. Those politicians and public policy makers who are responsible, over many decades, for the health care system in America, have, for the most part, shaped it according to libertarian principles of justice, while those politicians and public policy makers who are responsible, again, over many decades, for the health care systems in those countries with universal health care coverage, have, by and large, shaped them according to egalitarian principles of justice.

According to libertarian principles of justice, citizens might or might not have any kind of right to health care, but even if they do, it should not result in the placing of financial burdens on wealthier citizens to fund, in part or in whole, the health care of their less financially well-off counterparts. Rather, health care, like food, clothing, shelter, and all other goods and services available in society, should be distributed by the dictates of a free-market economic system. Those who are wealthier, and who are able to buy more expensive goods and services of superior quality, will also be able to afford to buy not only health care services and procedures themselves, but also a superior quality of such health care commodities. Those who are less wealthy, and who are able to buy less expensive goods and services of comparatively inferior quality, will be able to afford health care services and procedures, but only of a comparatively inferior quality. Finally, those who are financially impoverished will not be able to afford health care services or procedures at all. Under the public policy dictates of this type of health care system, the ethical principle of the autonomy of citizens to make their own choices, as citizens in society, takes precedence over the ethical principle of beneficence.

According to egalitarian principles of justice, each citizen in society has an equal right to health care services and procedures because each citizen in society has equal value as a person. Because the status of one’s health is foundational for one to even be able to enjoy a reasonably good quality of life (and all that that entails), the government is obligated to provide each and every one of its citizens with access to health care services and procedures. Unlike most of the goods and services the distribution of which is dictated by a free-market economic system, health care is essential to the well-being of every citizen. Of course, the politicians and public policy makers, in accordance with this type of health care system, would have to adjudicate the question of whether all health care services and procedures would be available to all of the members of society, in equal measure, or the ways in which, and the degrees to which, such services and procedures would be made available to the members of society. Under the public policy dictates of this type of health care system, the ethical principle of beneficence supersedes, in importance, the ethical principle of the autonomy of its citizens to make their own choices.

In the final analysis, the ways in which, and the degrees to which, particular health care services and procedures are distributed among the citizens of a given society depend on the dictates of the principles of justice not only as they are applied to the society’s economic system but also as they are applied to the society’s governmental system.

f. Health Care Organization Ethics Committees

The Joint Commission is the comprehensive accrediting agency for health care programs and organizations, of all types, throughout America, and has, for some time, mandated the inclusion of ethics committees as an accreditation requirement. The purpose of any health care organization ethics committee is to develop, to engage in an on-going process of the review of, and to ensure the proper application of the medical ethics policies of the health care organization in question. Such policies would normally include such significant issues in health care ethics as informed consent, confidentiality, euthanasia, assisted suicide, the withholding and withdrawing of medical treatment, the harvesting and transplantation of human organs, and many others depending on the specific type of health care organization. While there is a wide latitude concerning the membership composition of health care ethics committees, typically, the following professions are represented: physicians, nurses, social workers, senior administrators, risk managers, chaplains, and ethicists, in addition to lay people from the local community, among others.

Functions of a health care ethics committee include the following: to become informed about, and to maintain a credible level of awareness of, significant issues in health care ethics, generally, and their relationships to the needs of both the patients and the health care professionals who are associated with the health care facility in question; to educate, on an on-going basis, the health care professionals of the facility in question, in addition to the members of the ethics committee, on significant issues in health care ethics as well as the ethics committee’s policies concerning such issues; and to be responsible for the particular cases of the facility’s patients that warrant either a review by, or a consultation with, the ethics committee. The health care ethics committee is, usually, the final authority on ethics policy concerning medical issues, subject to approval by the facility’s Board of Trustees.

5. Conclusion

Health care ethics is a multi-faceted and fundamentally important issue for the citizens of any society because the provision of health care is essential to the well-being of each person, and the ways in which people are treated, concerning their health care, bear importantly on their health status. The many moral issues that arise out of the provision of health care are perennial issues. They range from those that are inherent in the relationship between the health care professional and the patient to those associated with abortion and euthanasia, from those to be encountered in biomedical or behavioral human subject research to those that have come about as a result of reproductive and genetic knowledge and technologies, and from those concerning the harvesting and transplantation of human organs to those that stem from public policy decisions as determinative of the allocation of health care services and procedures. To attempt to clarify these moral issues by use of the philosophical analysis of the language and the concepts that underlie them is, at least in theory, to provide a framework in accordance with which to make better quality decisions concerning them.

6. References and Further Reading

  • Anderson, E. S. (1990) “Is Women’s Labor a Commodity?” in Philosophy and Public Affairs, 19: Winter, pp. 71-92.
  • Aristotle (1985) Nicomachean Ethics, trans. by Terence Irwin, Hackett Publishing Co.
  • Beauchamp, T. L. and Childress, J. F. (2009) Principles of Biomedical Ethics, 6th ed., New York: Oxford University Press.
  • Beauchamp, T. L., Walters, L., Kahn, J. P., and Mastroianni, A. C. (2014) Contemporary Issues in Bioethics, 8th ed., Boston: Cengage.
  • Boylan, M. (2004) A Just Society, Lanham, Maryland: Rowman and Littlefield.
  • Boylan, M. (2012) “Health as Self-Fulfillment,” in the Philosophy and Medicine Newsletter, 12:4. (Reprinted in Boylan, M. (2014) Medical Ethics, 2nd ed., Malden, Massachusetts: Wiley-Blackwell, pp. 44-57.)
  • Boylan, M. (2014) Medical Ethics, 2nd ed., Malden, Massachusetts: Wiley-Blackwell.
  • Brandt, A. M. (1978) “Racism and Research: The Case of the Tuskegee Syphilis Study,” in the Hastings Center Report, 8:6, pp. 21-29.
  • Brennan, T. (2007) “Markets in Health Care: The Case of Renal Transplantation,” in the Journal of Law, Medicine & Ethics, 35:2, pp. 249-255.
  • Brock, D. W. (1998) “Cloning Human Beings: An Assessment of the Ethical Issues Pro and Con,” in Clones and Clones: Facts and Fantasies About Human Cloning, edited by Nussbaum, M. C. and Sunstein, C. R., W. W. Norton & Co.
  • Callahan, D. (1989) “Killing and Allowing to Die,” in the Hastings Center Report, 19 (Special Supplement), pp. 5-6.
  • Chadwick, R. F. (1989) “The Market for Bodily Parts: Kant and Duties to Oneself,” in the Journal of Applied Philosophy, 6:2, pp. 129-140.
  • Cohen, C. B. (1996) “‘Give Me Children or I Shall Die!’ New Reproductive Technologies and Harm to Children,” in the Hastings Center Report, 26:2, pp. 19-27.
  • Gert, B. and Clouser, K. D. (1990) “A Critique of Principlism,” in The Journal of Medicine and Philosophy, 15:2, pp. 219-236.
  • Glannon, W. (2001) “Genetic Enhancement,” in Genes and Future People: Philosophical Issues in Human Genetics, Glannon, W., Westview Press, pp. 94-101.
  • Harris, J. (1993) “Is Gene Therapy a Form of Eugenics?” in Bioethics, 7:2/3, pp. 178-187.
  • Held, V. (2006) The Ethics of Care, New York: Oxford University Press.
  • Holmes, H. B. and Purdy, L. M. (1992) Feminist Perspectives in Medical Ethics, Bloomington: Indiana University Press.
  • Jonsen, A. R. and Toulmin, S. (1988) The Abuse of Casuistry: A History of Moral Reasoning, Berkeley: University of California Press.
  • Kant, I. (1989) Foundations of the Metaphysics of Morals, edited and translated by Lewis White Beck, Library of Liberal Arts: Pearson.
  • Kevorkian, J. (1991) Prescription: Medicide, The Goodness of Planned Death, Prometheus Books.
  • Kuhse, H. (1997) Caring: Nurses, Women and Ethics, Oxford: Blackwell.
  • Kuhse, H., Schuklenk, U., and Singer, P. (2015) Bioethics: An Anthology, 3rd ed., Malden, Massachusetts: Wiley Blackwell.
  • MacKay, D. and Danis, M. (2016) “Federalism and Responsibility for Health Care,” in Public Affairs Quarterly, 30:1, pp. 1-29.
  • Marquis, D. (1989) “Why Abortion is Immoral,” in the Journal of Philosophy, LXXXVI:4, pp. 183-202.
  • Mill, J. S. (1861) Utilitarianism, in Collected Works of John Stuart Mill. Edited by J. M. Robson, Vol. X, Toronto: University of Toronto Press, 1969.
  • National Academy of Sciences (2002) Committee on Science, Engineering, and Public Policy, Scientific and Medical Aspects of Human Reproductive Cloning, Washington, D. C.: National Academy Press.
  • Nesbitt, W. (1995) “Is Killing No Worse than Letting Die?” in the Journal of Applied Philosophy, 12:1, pp. 101-105.
  • Noonan, J. T. (1968) “Deciding Who Is Human,” in the American Journal of Jurisprudence, 13:1, pp. 134-140.
  • Noonan, J. T. (1970) “An Almost Absolute Value in History,” in The Morality of Abortion: Legal and Historical Perspectives, John T. Noonan, Cambridge: Harvard University Press, pp. 51-59.
  • Purdy, L. M. (1989) “Surrogate Mothering: Exploitation or Empowerment?” in Bioethics, 3:1, pp. 18-34.
  • Rachels, J. (1975) “Active and Passive Euthanasia,” in the New England Journal of Medicine 292, pp. 78-80.
  • Ram-Tiktin, E. (2012) “The Right to Health Care as a Right to Basic Human Functional Capabilities,” in Ethical Theory and Moral Practice, 15:3, pp. 337-351.
  • Robertson, J. (1994) “The Presumptive Primacy of Procreative Liberty,” in Children of Choice: Freedom and the New Reproductive Technologies, Princeton: Princeton University Press, pp. 22-42.
  • Robertson, J. A. (1983) “Procreative Liberty and the Control of Conception, Pregnancy, and Childbirth,” in the University of Virginia Law Review, 69, pp. 405-464.
  • Savulescu, J. (1995) “Rational Non-Interventional Paternalism: Why Doctors Ought to Make Judgments of What Is Best for Their Patients,” in the Journal of Medical Ethics, 21, pp. 327-331. (Reprinted in Medical Ethics, 2nd ed. (2014), ed. by Michael Boylan, Malden, Massachusetts: Wiley-Blackwell, pp. 83-90.)
  • Savulescu, J. (2001) “Procreative Beneficence: Why We Should Select the Best Children,” in Bioethics, 15:5/6, pp. 413-426.
  • Savulescu, J. and Momeyer, R. W. (1997) “Should Informed Consent Be Based on Rational Beliefs?” in the Journal of Medical Ethics, 23, pp. 282-288. (Reprinted in Medical Ethics, 2nd ed. (2014), ed. by Michael Boylan, Malden, Massachusetts: Wiley-Blackwell, pp. 104-115.)
  • Shaw, D. (2009) “Euthanasia and Eudaimonia,” in the Journal of Medical Ethics, 35:9, pp. 530-533.
  • Sherwin, S. (1992) No Longer Patient: Feminist Ethics and Health Care, Philadelphia: Temple University Press.
  • Sherwin, S. (1994) “Women in Clinical Studies: A Feminist View,” in the Cambridge Quarterly of Healthcare Ethics, 3:4, pp. 533-539.
  • Silvers, A. (2012) “Too Old for the Good of Health?” in the Philosophy and Medicine Newsletter, 12:4. (Reprinted in Boylan, M. (2014) Medical Ethics, 2nd ed., Malden, Massachusetts: Wiley-Blackwell, pp. 30-43.)
  • Skloot, R. (2010) The Immortal Life of Henrietta Lacks, New York: Crown/Random House.
  • Steinbock, B., London, A. J., and Arras, J. (2013) Ethical Issues in Modern Medicine: Contemporary Readings in Bioethics, 8th ed., Columbus, Ohio: McGraw-Hill.
  • Stoller, S. (2008) “Why We Are Not Morally Responsible to Select the Best Children: A Response to Savulescu,” in Bioethics, 22:7, pp. 364-369.
  • Thomson, J. J. (1971) “A Defense of Abortion,” in Philosophy and Public Affairs, 1:1, pp. 47-66.
  • Tong, R. (1997) Feminist Approaches to Bioethics: Theoretical Reflections and Practical Applications, Boulder: Westview Press.
  • Tong, R. (2002) “Love’s Labor in the Health Care System: Working Toward Gender Equity,” in Hypatia, 17:3, pp. 200-213.
  • Tong, R. (2012) “Ethics, Infertility, and Public Health: Balancing Public Good and Private Choice,” in the Newsletter on Philosophy and Medicine, 11:2, pp. 12-17. (Reprinted in Boylan, M. (2014) Medical Ethics, 2nd ed., Malden, Massachusetts: Wiley-Blackwell, pp. 13-30.)
  • Wachbroit, R. (1996) “Disowning Knowledge: Issues in Genetic Testing,” in Report from the Institute for Philosophy and Public Policy, 16:3/4, pp. 14-18.
  • Warren, M. A. (1973) “On the Moral and Legal Status of Abortion,” in The Monist, 57:1, pp. 43-61.
  • Warren, M. A. (1988) “IVF and Women’s Interests: An Analysis of Feminist Concerns,” in Bioethics, 2:1, pp. 37-57.
  • Warren, V. L. (1992) “Feminist Directions in Medical Ethics,” in the HEC Forum, 4:1, pp. 73-87.
  • World Health Organization, Preamble to the Constitution of the World Health Organization, New York, June 19-July 22, 1946 (New York: Adopted by the International Health Conference, and signed on July 22, 1946.)

 

Author Information

Stephen C. Taylor
Email: staylor@desu.edu
Delaware State University
U. S. A.

Science and Ideology

This article illustrates some of the relationships between science and ideologies. It discusses how science has been enlisted to support particular ideologies and how ideologies have influenced the processes and interpretations of scientific inquiry.

An example from the biological sciences illustrates this. In the early 20th century, evolutionary theory was used to support socialism and laissez-faire capitalism. Those two competing ideologies were justified by appeal to biological claims about the nature of evolution.

Those justifications may seem puzzling. If science claims to generate only a limited set of facts about the world—say, the mechanisms of biological diversification—it is unclear how they could inform anything so far removed as economic theory. Part of the answer is that the process of interpreting and applying scientific theories can generate divergent results. Despite science’s capacities to render some exceedingly clear and well-verified central cases, its broader uses can become intertwined with separate knowledge claims, values, and ideologies. Thus, the apparently clear deliverances of natural sciences have been leveraged to endorse competing views.

Rightly or wrongly, this leveraging has long been part of the aims and practice of scientists. Many of the Early Modern progenitors of natural science hoped that science would apply to large swaths of human life. They believed that science could inform and improve politics, religion, education, the humanities, and more. One fictional version of this ideal, from Francis Bacon in the 17th century, imagined scientists as the political elites, ruling because they are best equipped to shape society. Such hopes live on today.

It is not only in its applications that science can become ideological; ideologies also can be part of the formation of sciences. If natural sciences are not hermetically sealed off from society, but instead are permeable to social values, power relations, or dominant norms of an era, then it is possible for science to reflect the ideologies of its practitioners. This can have a particularly pernicious effect when the ideologies that make their way into the science are then claimed to be results derived from the science. Those ideologies, now “naturalized,” have sometimes been granted added credibility because of their supposedly scientific derivation.

Not all sciences seem equally susceptible to ideological influence or appropriation. Ideologies seem to have closer connections to those sciences investigating topics nearer to human concerns. Sciences that claim to bear upon immigration restrictions, government, or human sexuality find wider audiences and wider disputes than scientific conclusions limited to barnacle morphology or quantum gravity.

The potential for science to become entwined with ideology does not necessarily undermine scientific claims or detract from science’s epistemic and cultural value. It hardly makes science trivial, or just one view among others. Science must be used well and taken seriously in order to solve real-world challenges. Part of taking science seriously involves judicious analysis of how ideologies might influence scientific processes and applications.

The topic is vast, and this article confines itself to some historical cases that exemplify significant interactions between science and ideologies.

Table of Contents

  1. Terminology
  2. Science and Political Economy
  3. Science and Race
  4. Science and Gender
  5. Science and Religion
  6. Science as Ideology: Scientism
  7. Conclusion
  8. References and Further Reading

1. Terminology

First, a brief note about definitions. What exactly is meant by “science” and by “ideology”? Much has been written attempting to define these concepts, but we only need the broad outlines of such attempts before moving on.

The word “science” derives from the Latin scientia, or knowledge. It has historically been closely associated with philosophy. At least since the Renaissance, the term has acquired connotations of theoretical, organized, and experiential knowledge.

In the 17th century, a constellation of practices, ideas, and institutions among natural philosophers contributed to what most historians recognize as the advent of modern science. Galileo Galilei, René Descartes, Francis Bacon, Robert Boyle, and Isaac Newton (all of whom considered themselves philosophers) wrote texts that subsequent practitioners lifted up as exemplary of the “new philosophy.” While there was no universal agreement on exactly what this new philosophy consisted of, some of the most salient elements included the rejection of Aristotelian forms and final causes; the attempt to account for most natural phenomena in terms of efficient causes operating according to laws of nature; the identification and quantification of objective “primary qualities” such as mass and velocity; and the introduction of experimental practices using the controlled operation of idealized or contrived events as evidence for nature’s operation.

Science encompasses two distinctive strands: a body of knowledge, and a coordinated set of instrumental activities that generate technological or engineering solutions. The former continues the legacy of natural philosophy through its aim to understand, explain, and predict the world. The latter has more pragmatic concerns: to build tools and solve problems. Perhaps unsurprisingly, philosophers have paid most attention to the first, natural philosophical, strand of science.

In the mid-20th century, philosophers launched a vigorous campaign to correctly characterize science and thus distinguish it from illegitimate forms of knowledge or pseudoscience. If the scientific method could be correctly identified, they supposed, then the right method for knowledge generation could be secured, and there would be a better way to jettison dubious, nonscientific, or merely ideological claims. For example, Karl Popper was famously keen to exclude Marxist historiography and Freudian psychoanalysis from the province of science. Along with Popper, Imre Lakatos and others contributed to a sophisticated body of literature on scientific method, attempting to square the idea of characteristic and rational rules of science with the historical record of dynamic, changing scientific theories and practices. Paul Feyerabend, by contrast, urged abandoning the search for rules of science altogether; he argued that, since science is a creative and evolving enterprise, there is no specific method it ever did, or should, follow.

The campaign to distinguish science from pseudoscience has now largely subsided with no clear resolution. Some philosophers see scientificity as a matter of degree that can be instantiated to a greater or lesser extent according to how systematic the study may be. Nonetheless, a single definition of science remains elusive. The diversity of activities and methods used across the natural sciences makes it difficult to find anything that neatly separates sciences from other human activities not typically considered scientific, like auto mechanical work. As one philosopher put it, “Why should there be the method of science? There is not just one way to build a house, or even to grow tomatoes. We should not expect something as motley as the growth of knowledge to be strapped to one methodology” (Hacking 1983).

Much like science, “ideology” is notoriously difficult to pin down as a single, determinate concept. The term was originally proposed around the year 1800 to be, quite literally, a science of ideas: a way to rigorously study humans’ ideas as part of natural history. The term’s creator, Destutt de Tracy, even imagined this new science as a branch of zoology.

But the word has since changed its meaning and today frequently carries a negative connotation. In informal discourse, “being ideological” is often a pejorative label used to accuse someone of being blinkered to reality by a particular set of beliefs. This pejorative sense of ideology comes largely from classical social theorists, especially Karl Marx. For Marx, to be in the grip of a false ideology was to naively adopt ruling class ideas about art, religion, ethics, or politics, which are actually explained by that society’s economic structure. Those ideologies, Marx believed, generated a false consciousness about one’s own world and diverted one’s attention from true sources of oppression (Marx and Engels 1938). While ideologies claim to describe the way things are, Marx claimed that in reality they function to defend political structures underpinning class hierarchies. Marx diagnosed and critiqued such ideologies, hoping thereby to liberate individuals from self-oppression and to bring about social reforms. In this tradition, ideology was often seen as antithetical to science. This conceptual contrast between science and ideology has largely been passed down to us today, for example, when science is imagined to be quintessentially nonideological.

Following Marx, subsequent theorists extended views of ideology and why it might be harmful. Political philosopher Hannah Arendt criticized ideology for the way it short-circuits substantive political debate. Ideologies posit basic tenets or first principles, such as racial purity, class struggle, or free markets, from which other ideas automatically follow. According to Arendt, ideologies have a pernicious role in replacing genuine ethical debate with their own abstract and internal logic. Promising certainty, ideologies run roughshod over tradition, concrete historical particulars, and the difficult business of moral deliberation (Arendt 1973).

This article does not adhere solely to theoretical frameworks that criticize ideology. It therefore treats “ideology” in its broader and more neutral sense, as a description of the organizing beliefs of a population. This second, broader use accords with the practices of empirical anthropology, which might seek to describe the organizing beliefs of a foreign culture. When conceived of in this descriptive sense, ideologies may be understood as necessary or positive for many political purposes. Ideologies in this sense are merely ways of interpreting or “mapping” our political and social environments (Freeden 2003).

Some important features are common to both the pejorative and more neutral senses of ideology. First, ideologies are beliefs that legitimate or stabilize social power structures. Broadly speaking, ideologies relate to politics because they have a social function, and as such they can engender a sense of group identity or motivate the need for action. Second, ideologies are not always transparent to those who hold them. It is often easier to recognize ideology in others than in oneself. Third, ideologies involve beliefs that are closer to the center of one’s web of belief. That is to say, they are not easily acquired and released, because they play a structural role in how we see things, what is construed as evidence, and sometimes even personal identity. Fourth, there is typically a complex admixture of descriptive and prescriptive elements to ideologies: Their defense would appeal to the way things are and how things ought to be (Seliger 1976).

We need not dwell on these attempts to define such complex terms as science and ideology. It is worth noting, however, that particular definitions of the terms would render an analysis of science and ideology much less significant—or even meaningless. If science were just descriptive and ideology just prescriptive, then perhaps they would be two radically different sorts of things, and the two should never meet, since, according to some philosophers working in the tradition of David Hume, an is cannot generate an ought. On this view, they could not overlap without some improper transgression of one into the rightful territory of the other. However, ideologies are not just wishful desires; they are informed by some facts and make claims about the way the world is. Conversely, some philosophers argue that science is not accurately characterized as value-free, purely descriptive facts, but instead that science is laden with values (Douglas 2009).

A second set of definitions that might render the topic of science and ideology less meaningful would be if science were essentially or only ideological in nature, so that the two terms wholly collapse into one another. If science were just politics by other means, then perhaps “science” would not add anything new to an investigation of “science and ideology.” But this collapse can be resisted. While we can fruitfully analyze the generation and transmission of scientific knowledge in its purely social and anthropological dimensions (that is, without reference to truth or to any unconditioned external reality), this does not make science nothing but ideology. Ignoring the distinctiveness of the world from human cognition risks an untenable relativism.

Accordingly, we may rest content with broad and common notions of science and ideology, recognizing that they label many different things and that their boundaries are not precise. This need not hinder investigation. Prototypically at least, sciences are not just ideologies. There may be overlap in the real-world history of science, but the terms regularly and usefully label distinct notions.

2. Science and Political Economy

Many well-known discussions of ideological influence on science illustrate how ideology can warp science. One notorious episode frequently construed as an ideological distortion of science is from mid-20th century Soviet biology, when the agricultural research of Trofim Lysenko was at the center of a broader effort to shape a uniquely Soviet biology (Roll-Hansen 2005; Graham 2016). Lysenko and others claimed that grain growth and heredity could be significantly influenced by environmental alterations such as treating the seeds with cold and moisture, and that such alterations could lead to improved crop yields and the reformulation of genetics writ large. The claims about temperature effects are true, while the latter claims are contested and more problematic. The ideological forces contributing to the rise of Lysenko’s science were at least twofold: First was a Soviet concern that natural science should address practical problems and contribute to the common good of the people; the connection with agriculture was obvious in this period of scarcity and famine. Second was the Marxist precept that organisms are shaped primarily by their environments rather than determined by innate biological traits. Some Soviet scientists and politicians of the period understood Western genetics to be corrupted by capitalist notions of competition, innateness, and individualism, while they saw Western science more generally as unduly prioritizing pure theoretical science disconnected from the needs of the masses. While there was some merit in such critiques, Lysenkoist science was a failure on its own terms: Crop yields were not radically improved. Moreover, and perhaps most importantly, Stalin’s explicit approval of Lysenkoism as officially Soviet, and the ensuing eradication of a critical research community, including the imprisonment of dissenting scientists, contributed to the precipitous decline of Soviet genetics in this period. Political power structures that hinder open and critical debate damage science.

Ideological influence is exerted not only upon scientific research but also upon the dissemination of that research. Popular understanding of science is crucial for public policy formation, and that understanding can be shaped by any number of forces. For example, multiple independent lines of evidence established a link between cigarette smoking and lung cancer in the 1940s and 1950s, yet the tobacco industry, aware of these health effects, lobbied think tanks, academics, and media executives to disseminate a message that this science was inconclusive. The industry’s efforts were immensely successful, as many Americans, including medical doctors, reported believing that science had no conclusive evidence for such a link for decades afterwards (Michaels 2008; Brandt 2012; Proctor 2012). The same tactics of purposefully manufacturing scientific uncertainty have been deployed to spread ignorance about scientific knowledge of acid rain, ozone depletion, and greenhouse gas emissions (Oreskes and Conway 2010). Behind this campaign of manufactured doubt has been a political concern that some science could be used to support environmental or public health regulations, thus threatening the unregulated markets that some groups find central to political economics.

While ideologies can distort science and its popular understanding, it is important to point out that many of the classic studies of science and ideology investigated which ideologies provided the best contexts for scientific advance (Bernal 1939, Merton 1942). An important thesis concerned whether Western-style liberal democracies could be the best political arrangements for the production of quality science. One idea here was that good science may require a kind of openness to critique that is essentially a political ideal, and that such openness also underpins liberal democracies. One contrast, during this time period, was the Soviet Union’s communism, which excelled in centralized planning of science. State direction of scientific activities contributed to the Soviet Union’s Cold War successes, such as Sputnik, and such strategies were also sometimes used by the US, for example in its Manhattan Project. Political ideologies shape science through funding, planning, institutionalization, and their political ethos.

Much discussion has also been generated by the question of which political or economic ideologies might be supported by particular scientific theories. To take just one example, the theory of evolution by natural selection has been used to legitimate multiple and incompatible political ideologies, from conservative politics and laissez-faire capitalism to socialism.

Biology has often been used to reinforce essentialist, individualist, and conservative doctrines. If people are who they are because of innate traits, and society is the way it is because of those traits too, then it seems as if nature itself underwrites the political order. On this view, class structure has its particular form because the upper classes have the right stuff in their blood. Attempts to change the political order, then, would mean not just fighting a status quo, but fighting nature itself. Such ideas, sometimes called “biological determinism,” minimize the influence of environments, history, and culture in shaping societies or individuals and are typically used to oppose efforts to shape society through education, welfare programs, or other promotions of social mobility.

Biology has also been used to bolster a specifically capitalist ideology that places competition at the center of its worldview. The idea here is that just as organisms’ competition for scarce resources eventually generates evolutionary change by weeding out the unfit, so also individual competition should yield social and economic progress. One source for this view in the 19th century was scientific naturalist Herbert Spencer, the pre-Darwinian popularizer of evolution who coined the term “survival of the fittest.” Spencer’s view of evolution was all-encompassing and ardently progressive, positing competition at the center of a process yielding a more harmonious “social organism.” Spencer imagined a biological process responsible for progress in social, political, economic, and even racial dimensions. While Spencer did not intend to justify corporate or state rapaciousness, his popular evolutionary narrative was adopted by others to justify laissez-faire capitalism. Upon studying Spencer, American industrialist Andrew Carnegie testified, “I remember that light came as in a flood and all was clear… I had found the truth of evolution. ‘All is well since all grows better’ became my motto, my true source of comfort” (Carnegie 1920). Such ideas apparently meshed with Carnegie’s objection to government influence in commerce, his repudiation of workers’ unions, and his insistence that the concentration of capital by industrialists like himself was essential for social progress. Capitalists were confident nature was on their side.

Socialists were too. Many socialists seized on the materialist implications of evolution (that biological history could be explained in terms of natural laws) to support their view that social history was likewise governed by laws. Some said that Marx had anticipated Darwin by developing an evolutionary picture of social change. The philosopher Georgi Plekhanov went further, practically equating the two theories: “Marxism is Darwinism in its application to social science” (1956). Friedrich Engels thought that evolutionary theory provided evidence for the dialectical nature of historical change, which he argued was key to understanding social and natural history alike. Others saw evolution as evidence for socialism only when purged of its problematic framing as essentially competitive. The Russian scientist and philosopher Peter Kropotkin emphasized the centrality of cooperation in biological evolution; his (1902) study of mutual aid argued that a variety of mutualistic and altruistic behaviors had been largely underrepresented in contemporary biology in favor of the more gladiatorial frameworks deployed by British naturalists. For Kropotkin, the extent of cooperative behaviors in nature bore lessons for social organization writ large: While the “unsociable species” were “doomed to decay,” the more sociable ones were invariably “more prosperous,” open to “further progress,” “higher intellectual development,” and “further progressive evolution.” In turn, Kropotkin advocated a distinctive version of small-scale communism based on voluntary cooperative living.

Indeed, many have found nature replete with lessons about social order, and nature’s authority has been claimed by reactionaries and revolutionaries alike. Darwinism has been grafted onto political economics by various institutions and individuals to serve distinct ends. These combinations of Darwinism and political economics were then no longer straightforwardly scientific theories, but malleable cultural resources that could serve various interests.

Darwin’s evolutionary theory, postulating common descent and natural selection as a mechanism of change, has been accepted in broad outline by contemporary biologists. Moreover, there is a widespread expectation that evolution should inform and enrich many other areas of science and human life. How to use that theory, and what it means for our understanding of economics or politics, remain topics of continued debate. In particular, there is considerable ambiguity in the scope of evolutionary generalizations. Questions remain as to what phenomena evolution applies to, what it does or does not explain, and whether certain forms of social organization are more natural than, and therefore preferable to, others. Such questions are not settled by the biological data that were so influential in the theory’s adoption, and they remain contested today.

3. Science and Race

Racist societies have generated racist sciences. If, as was hinted above, science is sometimes permeable to social values, then it makes sense that racist ideologies could make their way into the questions, methods, and analyses of some scientists. Decades of diverse research programs were devoted to establishing the natural basis of European racial supremacy. In the 20th century, eugenics continued the legacy of racist science in its widespread adoption throughout Europe and North America.

Eighteenth and 19th century anthropologists regularly described non-European peoples and cultures as “savage,” “primitive,” and “uncivilized.” Their subjects were typically described in opposition to the “advanced” cultures that anthropologists imagined themselves part of. Early anthropology was closely linked with the colonial projects of Europe, and the notion that foreign peoples were incompetent to look after themselves fit well with the drive to colonize foreign places to extract their resources, bodies, and labor. This period gave rise to the notion that races are biological categories. While theorists continue to debate whether there are viable biological notions of race—for example, as lineages whose geographical isolation is responsible for superficial phenotypic differences (Kitcher 2007)—many contemporary anthropologists, biologists, and philosophers reject the notion that folk categories of race are real biological divisions (Baker et al. 2017; Gannett 2004; Witherspoon et al. 2007; Yudell et al. 2016; Winther and Kaplan 2013).

But if races were distinct biological populations, as many scientists of the 19th century believed, then one scientific task was to classify these distinct groups. An important question among these biologists was whether races descended from a single source—assumed to be Adam and Eve, according to their Christian beliefs—or from multiple, separate sources, perhaps from different places or different Adams. These hypotheses were labeled monogenism and polygenism. Polygenists found an important spokesperson in Harvard biologist Louis Agassiz. Quantitative evidence for Agassiz’s polygenism came from Samuel George Morton’s renowned biometrical measurements of cranial volumes. In this period, skull sizes were believed to be indicators of mental capacity, and Morton’s studies “found” just the answers he expected to find: Europeans had the largest cranial volumes. Such studies were later discovered to be badly compromised by selection bias, but not before they had a significant impact on social policies that disenfranchised non-Europeans. Agassiz, one of the most influential American biologists of the 19th century, used those studies to argue for polygenism, the innate inferiority of “colored races,” and by extension, for separate educational regimes for different ethnicities (Gould 1996).

Darwin hoped that the monogenism inherent to his own theory—this time evolutionary in character rather than creationist—would have remedial social effects. Because evolution posited common descent, emphasizing humans’ shared history, Darwin hoped it would diminish the scientific arguments for racial hierarchy, and therefore contribute to the demise of the slave trade that he abhorred (Desmond and Moore 2009). However, many scientists found that their racism was compatible with multiple scientific theories, including Darwin’s: If we all evolved from a common ancestor, they reasoned, then some of us are more evolved than others. Because evolutionary theory was widely understood as a kind of progressive force molding better and better organisms, it was sometimes used to separate the putatively advanced from less advanced humans, and such scientific hypotheses aligned with common social hierarchies of the time.

Some of the racist proclivities visible in the biometrical programs of cranial measurement persisted into later strands of psychology, including intelligence measurement. Intelligence tests, originally designed by Alfred Binet for diagnostic and remedial purposes, were later transformed by Henry Goddard, who interpreted the tests as indicators of an innate general intelligence. Goddard and many others in his wake used such tests to articulate the social “menace” posed by those of low intelligence, and also to argue for immigration restrictions. His IQ tests were administered to newly arrived immigrants at Ellis Island, where Goddard claimed they showed that about 80% of Jews, Hungarians, and Italians—groups that were often considered inferior races—were officially “feeble-minded.” Goddard concluded, “[T]he immigration of recent years is of a decidedly different character from the early immigration… We are now getting the poorest of each race” (cited in Gould 1996).

Underpinning many lines of such nativist and racist science was a belief in hereditarianism, the doctrine that heredity, rather than environmental influences, decisively shapes or even determines human character traits, including personality and intelligence. For example, many scientists believed that traits like criminality could be passed on from one generation to the next. This hereditarian doctrine, when combined with the modernist political will for social engineering and optimism that the nascent science of genetics would discover discrete underpinnings of traits like criminality, contributed to the rise of eugenics in the early 20th century.

Darwin’s cousin Francis Galton coined the term eugenics, meaning “good breeding,” in 1883 to describe the application of hereditary science to human improvement. The idea was to improve society through more selective reproduction. It could take the form of positive eugenics, encouraging reproduction among the “right” kind of people, or negative eugenics, discouraging or prohibiting reproduction among the “wrong” kind of people. It was implemented around the world but especially in Europe and North America; records show that 20,000 people were sterilized against their will in the state of California alone. While eugenics reinforced multiple social prejudices against the disabled, the poor, and the “feeble-minded,” racism was a central element of its broad agenda.

Eugenics garnered widespread support from many corners of public life, including conservatives, progressives, scientists, and the religious. As just one measure of its broad scientific backing, consider that no fewer than five presidents of the American Association for the Advancement of Science were members of the advisory board for the American Eugenics Society. Eugenics flourished in different forms of governments, including socialist, liberal democratic, and authoritarian (Mottier 2010). Galton hoped that eugenics might one day obtain the mass social appeal of “orthodox religion,” and this hope was not far off: Eugenics enjoyed broad support among Protestants, and in America there was even a competition for the best sermon supporting eugenics (Rosen 2004). While there was disagreement about how to implement eugenics, there were few institutional voices questioning whether eugenics should be implemented until the 1930s, when the Catholic Church voiced its opposition. British Catholic and public intellectual G. K. Chesterton (1922) was a noteworthy exception to the broad consensus favoring eugenics.

Madison Grant’s (1916) Passing of the Great Race extended hereditarian thinking with explanations of how climate molded Nordic superiority, leading to an advanced race of humans. Grant combined this notion of Nordic supremacy with the leitmotif of white fragility. Whiteness, in this tradition, was fashioned as dominant and innately superior, but at the same time fragile and threatened with imminent demise. Grant was an American amateur anthropologist, but he found a wide audience, and he received a personal note of praise from an overseas admirer, none other than Adolf Hitler, who called the book “my Bible.”

Hitler’s Third Reich was largely founded upon a biomedical ideology of “racial hygiene” (Proctor 1988). The regime is most infamous for its anti-Semitism, but its targeted killings began with the disabled, Roma people, homosexuals, and others who were thought to threaten the purity of the Nordic ideal advanced by Grant and others. Such ideals were construed as public health policies in Germany, backed by physicians in the name of national health. Those policies were continuous with—and in fact sometimes based on—policies arising from American eugenic programs (Kühl 1994, Whitman 2017). As late as 1934, American physicians in favor of forced sterilization laws lamented that “The Germans are beating us at our own game” (cited in Kevles 1985).

The eventual reaction against eugenics was based partly on collective horror at the atrocities of the Holocaust. In addition to this political change in temperament, there were also scientific repudiations of eugenics, notably from anthropologist Franz Boas and biologist Theodosius Dobzhansky. Dobzhansky argued that natural selection maintains variation in populations, and that such variation is biologically beneficial. Accordingly, the reduction of such genetic variation via eugenics would be disastrous (Beatty 1994, Paul 1994). In this way, Dobzhansky became one of the predominant critics of eugenics and defenders of human diversity.

4. Science and Gender

Gender ideologies are often visible in the history of theorizing the natural basis of sex (Tuana 1989, Keller and Longino 1996). Aristotle, a progenitor of biological science, writes that being a woman is essentially a deficiency, being a kind of incomplete male. In a series of psychological, anatomical, and physiological comparisons, he contrasts male and female organisms, typically highlighting females’ inferiority. Women are not only “less perfectly formed” than men, but they are even “mutilated” versions of men. Bewilderingly, given that he was such a careful observer, he even wrote that women have fewer teeth than men. For Aristotle, being female is often defined in terms of the female’s incapacities: to concoct blood, to produce semen, or to convert menses into something better. On the topic of reproductive contributions of males and females, he theorized that men pass on the “active principle” of the human form through their semen, whereas women contribute the passive material causes of the embryos.

Aristotle’s biological work was hugely influential for many centuries, and even later scientists noteworthy for challenging Aristotle’s authority still reaffirmed his traditional Greek view that women are biologically inferior to men (Lloyd 1983, Merchant 1990). The case of reproductive physiology is again illustrative. The Roman physician Galen, for example, attributed formal and material causes to both males and females, but nevertheless insisted on female inferiority because of their “imperfect” semen and because their genitalia were internal. Seventeenth century thinkers continued this line of research bolstering male superiority. William Harvey, most famous for his discovery of blood circulation, assigned efficient causes to both male and female reproductive powers, but still insisted that the male was “the superior and more worthy progenitor” (cited in Merchant 1990). Such work supported a predominant belief in Early Modern Europe that males were progenitors while females were essentially incubators.

These cases also illustrate how being female was interpreted as a deviation from the norm, the best, or the perfect. That womanhood was theorized as an alterity reflects an important fact about the homogeneous population doing the theorizing for most of the history of science, namely, that they were all men.

According to some 19th century psychologists, paleontologists, and anthropologists, women are more infantile, immature versions of men. Whether based on measurements of cranial volume or psychological development, the view here was that women exist in a childlike stage that males would outgrow. Moreover, according to this thinking, women are biologically closer to animals and the “savage.” German zoologist and physiologist Carl Vogt wrote, “The female European skull resembles much more the Negro skull than that of the European man…[W]henever we perceive an approach to the animal type, the female is nearer to it than the male” (quoted in Russett 1989). Notice the confluence here with the above section on race, where evolutionary narratives were used to establish European supremacy; similar narratives were used to establish male supremacy (Milam 2010).

The physical sciences were also relevant for investigations into gender. In the wake of successful developments in thermodynamics and energy conservation, proponents of “limited energy theory” sought to explain sex differences in the human developmental process. Harvard physician Edward Clarke theorized that strenuous work in one part of the body limited ability and development of other parts of the body. “The brain cannot take more than its share without injury to other organs. It cannot do more than its share without depriving other organs of that exercise and nourishment which are essential to their health and vigor” (Clarke 1873). Limited energy theory had important ramifications for educational practices, according to Clarke, since women who sought the same educations as men diverted their energies from their bodies to mental work, thus risking “neuralgia, uterine disease, hysteria, and other derangements of the nervous system” (1873). Clarke warned that giving men and women equal educations threatened the very survival of the human species. While such theories might seem humorously arcane today, they were partly responsible for excluding generations of women from higher education.

More recent biological sciences, too, have been liable to rely on cultural gender prejudices when describing reproductive behavior and anatomy. Many have detected common Victorian gender prejudices in Darwin’s work, especially his writing on sexual selection (Roughgarden 2009, Richards 2017). The stereotype of the passive female and the adventurous, competitive male has proved remarkably enduring, apparently making its way into late 20th century cell biology. One consequence was an overemphasis on the passivity of the female egg during fertilization: The most influential cell biology textbook of the era described how “an egg will die within hours unless rescued by the sperm” (cited in Martin 1991). Such stereotypical metaphors, aligning with widespread gender ideologies, could impede science to the extent that they hinder investigations or descriptions at odds with culturally entrenched ideas. Indeed, subsequent discoveries of the egg’s active roles in fertilization were nevertheless slow to change biologists’ descriptions. Alternatively, such metaphors could unwittingly naturalize human cultural norms and make them seem unquestionable: “That these stereotypes are now being written in at the level of the cell constitutes a powerful move to make them seem so natural as to be beyond alteration” (Martin 1991).

One further aspect of gender is sexuality, and psychiatric science has shaped, and been shaped by, sexual norms and ideologies. Twentieth century typologies of disease, notably the official manual of mental health known as the Diagnostic and Statistical Manual of Mental Disorders (DSM), pathologized homosexuality in an era when it was considered deviant. According to that standard, homosexuality was officially a psychiatric illness in the United States from 1952 to 1973, and variant categories of homosexuality persisted in the DSM through 1987. While homosexuality has since been de-pathologized in the medical community, some religious communities continue to advocate “reorientation therapy” to treat what they consider the malady of homosexuality (Waidzunas 2015). The history of many mental health disorders has been closely associated with social trends; being mentally healthy may often depend on social attitudes about the acceptable range of normalcy and variation.

5. Science and Religion

Religions can form the basis of totalizing belief systems encompassing cosmology, theology, politics, and ethics, and so for some theorists, religion is the quintessential ideology. Marx famously called religion “the opiate of the masses” and thought it was precisely the kind of ideology from which people needed liberation in order to understand power dynamics as they truly are. He thought that religions like Christianity served the interests of the ruling classes by placating adherents, making them less willing to acknowledge and confront manifest injustices by deferring justice to an afterlife rather than establishing a more equitable society on earth.

Accordingly, if religion is a typical ideology, then a familiar narrative contrasts religion with science, supposing they are locked in essential conflict with each other. This notion looms large in the popular imagination, and conflict is especially apparent as it has related to the interpretation of religious scriptures. Galileo’s condemnation by the Catholic Church partly involved the church’s resolution to control the interpretation of scripture, which was especially salient during the Counter-Reformation following the Council of Trent. The book of Joshua records that God stopped the sun (presumably from moving around the Earth), which the Church interpreted as evidence for a geocentric planetary order. Galileo suggested an alternative interpretation of the passage that was compatible with heliocentrism, but religious authorities of the 17th century were reluctant to let an outspoken astronomer dictate the correct meaning of scripture.

While strictly literal interpretations of scripture have not been standard in the Christian tradition, some Christians’ opposition to evolutionary theory today likewise hinges on their literal interpretation of religious texts, which they say describe how the world was created in seven days in the year 4004 BC, according to a traditional 17th century chronology by Bishop James Ussher. Evolutionary theory, positing species transmutation and an enormously extended historical timescale, found mixed reception among Christians in different times and places. In America, Darwinian evolution did not meet much resistance until the 1920s, when some Christian evangelicals and fundamentalists linked evolution with threats to favored theological and moral orders. At that time, there was little debate about the status of organic evolution among professional biologists or among most religious leaders, but its tenability was soon called into question especially as a way to influence secondary school curricula. It was in this connection that evolution became the topic of globally publicized courtroom dramas: first as the 1925 Scopes “Monkey” trial on whether evolution was allowed in a Tennessee classroom, and later as the 1982 United States district court decision on whether creationism was allowed in an Arkansas classroom (Ruse 1988). When creationism was judged to be a religious rather than scientific theory, and thus ruled out of biology classes, it morphed into intelligent design theory, which focused less on advancing specifically Biblical explanations, and more on challenging the status of evolutionary theory. Since the 1970s, creationists and intelligent design theorists alike sought intellectual support from scientific and philosophical resources including Francis Bacon, Karl Popper, and Thomas Kuhn to argue that their preferred version of science was on equal footing with evolutionary theory (Numbers 2006).

Antievolutionism was not led primarily by churches but by individuals like William Jennings Bryan and George McCready Price. Bryan was the populist politician, a three-time presidential candidate of the Democratic party, who battled evolution at the Scopes trial. For Bryan, evolution was associated with moral decay and a decline of Biblical authority. Bryan thought Darwinism was implicated in the militant German nationalism of World War I and the decrease in religious belief among college-educated Americans. Despite Bryan’s renown as an opponent of evolution, he was primarily concerned with protecting the supernatural origin of humans, and in fact had no qualms with evolution in general or the standard reading of “days” in the book of Genesis as extended periods of time compatible with geological findings. Young-Earth creationism was born as the “flood geology” of Seventh-day Adventist George McCready Price, who posited a literal six-day creation narrative and a young-Earth chronology. While this was not the traditional Christian interpretation of Genesis, Price advocated for this strictest version of creationism because he was following the teachings of Adventist founder Ellen G. White, who claimed divine inspiration for her view that the fossil record was the result of the Noachian flood. While much of the rhetoric among creationists has focused on matters of Biblical interpretation, the fact that such strident literalist antievolutionism took form only in the 1920s, and did not catch on with a broader public until the 1960s, suggests that creationism is at least partly explained by social and political conditions unique to those periods, such as some Christians’ rejections of what they considered modernity’s excesses.

The supposition that there is an essential conflict between science and religion is often founded on the premise that they are pursuing the same goals (say, the true description of the world) and so they are competing for the same territory. One narrative based on that notion of shared goals has it that science is displacing religious explanations of natural phenomena: Where mythological or religious explanations once sufficed, we now have true scientific explanations. However, the premise that science and religion share the same goals has been disputed from various quarters. Biologist Stephen Jay Gould argued that science and religion are “non-overlapping magisteria,” two realms concerned with two separate subject matters: science with facts and religion with values (Gould 1999, see also Brooke 2016). Reformed theologian Karl Barth, arguing from a very different perspective, theorized how science and religion rest on wholly separate foundations: science on empirical reality and religion on revelation. Such arguments are sensitive to the ways that sciences and religions evince distinctive ends and practices; perhaps they do not share the same goals after all.

If science and religion sometimes pursue separate goals with separate methods, then this diminishes the emphasis on conflict. Historically, at least, the emphasis on conflict is an incomplete way to tell the story of science and religion. It was not a common way to think of the relationship of science and religion until recently. The “conflict narrative,” as it is known by historians, dates only from the late 19th century, from influential if methodologically flawed history texts by John William Draper and Andrew Dickson White. No such totalizing conflict was perceived for most of the history of science (Brooke 1991, Numbers 2009, Harrison 2010).

While the sources of modern sciences are diverse, reaching back to ancient Greek and medieval Arabic and European roots, modern sciences were institutionalized in an overwhelmingly Christian Europe in the 17th century (see also Effron 2010). It would have been quite surprising, then, if this new “mechanistic philosophy,” as it was then known, was considered irreligious. It was not. Many of the architects of modern sciences were themselves Christians of one stripe or another, in whose minds there was no conflict between their own scientific and religious practices. To the contrary, for most of these early scientists, doing science was a pious activity especially befitting the religious, insofar as coming to know God’s creation was a way of coming to know the Creator. The tradition of natural theology, which sought to infer the existence or attributes of the Creator through the design apparent in the creation, was a religious framework for doing science for centuries (Re Manning 2013, Topham 2010). Kepler, Galileo, Newton, and many others believed that doing science amounted to deciphering the “book of nature”—a common theological metaphor that placed scientific investigation alongside the study of religious scripture. Robert Boyle, the 17th century chemist and namesake of Boyle’s Law, labored to ensure that the new mechanistic philosophy was not seen as threatening religious belief, but rather as more compatible with Christian theology than the reigning Scholastic approach of his time. In one passage, Boyle even advocated performing experimental science on the Sabbath, as it could be considered a form of worship (Davis 2007).

Accordingly, the conflict narrative does not capture most of the history of science and religion. Science advanced not despite, but often because of its religious significance to early scientists. As one historian writes, “a distinctive feature of the Scientific Revolution is that, unlike other earlier scientific programs and cultures, it is driven, often explicitly, by religious considerations: Christianity set the agenda for natural philosophy in many respects and projected it forward in a way quite different from that of any other scientific culture” (Gaukroger 2006). Impulses arising from within religious movements spurred and shaped the formation of natural sciences (Harrison 1998).

If contemporary historians reject the conflict view relating science and religion, they have adopted a more nuanced position known simply as the complexity thesis, which states that there is no single relation between science and religion. Such complexity should be entirely expected if science and religion are not stable, monolithic entities with timeless essences, but instead are labels for diverse, dynamic traditions of thought and practice. Consider briefly that there is no essential element shared across all religions—not even a general one such as belief in gods. It should not be surprising, then, that all those things called religions might not have a single relationship with science. Such complexity, then, provides a warning sign for all studies of science and religion: Sweeping narratives that so readily lend themselves to ideological or rhetorical purposes often ignore complexity at the cost of historical accuracy.

6. Science as Ideology: Scientism

Finally, it is worth noting a sense in which science itself can form a basis of an ideology. When science is credited as the one and only way we have to describe reality, or to state truth, such restrictive epistemology might graduate into scientism. According to this view, the only rationality is scientific rationality. Poetry, literature, music, fine art, religion, or ethics could not be considered sources of knowledge, according to this view, because they are not generated by scientific methods. Such fealty to the deliverances of science, especially at the expense of other ways of knowing, can become ideological, and scientism is the preferred description of such a view. While enthusiasm for science has been a part of its ethos since the Enlightenment, scientism goes beyond enthusiasm in its insistence that whatever falls outside the scope of science is not knowledge. Alternatively, scientism is sometimes used to refer more specifically to the uses of science to inform policy. If political issues are framed as scientific, so that scientific evidence alone can adjudicate the right policies, it constitutes a strongly technocratic move to replace politics with science, and such replacement can also be a form of scientism.

The use of the label “scientism” typically implies a negative judgment about a problematic fidelity to science, but a few theorists have embraced the label as well. There is no simple relationship between science and scientism. Many scientists reject scientism, while some humanities scholars promote it. When humanists decide they ought to work within a metaphysics they imagine to be scientific, they may feel compelled to adopt a materialist or reductionist framework rejecting traditional categories of humanistic inquiry, such as person, will, freedom, judgment, or agency. Insofar as natural sciences might not recognize those categories, some humanistic scholarship has been transformed—some would say attenuated—by the loss of such concepts (Pfau 2013).

We can identify at least four challenges for scientism. First, an overweening loyalty to science and rejection of nonscience may presuppose that such categories have discrete boundaries. As noted in Section 1, however, the longstanding attempt to characterize science through a definition or definitive methods has been largely unsuccessful. It has proven incredibly difficult to specify exactly what makes an approach to the world scientific, which obviously problematizes the derogation of nonscience. Second, the appeal to science can obscure the question of which parts of science are being drawn upon. If science consists of a variety of distinctive practices, answering many different questions with many different methodological approaches, then appeals to science simpliciter can obfuscate important questions about which science is being included, which omitted, and how it is analyzed. This is important because different scientific studies and methods often do not align to provide straightforward results: Separate analyses even of the very same data can yield remarkably divergent conclusions (Stegenga 2011). Third, proponents of scientism sometimes marshal their own scientific credentials to back their claims. In a society that grants so much cultural authority to scientists, those credentials can easily bestow rhetorical power. Nonetheless, scientific expertise does not automatically entail expertise in other areas, and it has proved all too easy for, say, some biologists to make philosophical and theological pronouncements without training in, or even appreciation for, those other fields of study. A fourth challenge faces scientism as a replacement for politics; the problem is that political debates are typically not exhausted by their scientific dimensions. Issues like climate change or race relations, for example, involve more than scientific results; they also include conceptions of justice, freedom, economics, and even religion, which are each infused with ethical concerns. Politics cannot be reduced to technical scientific problems, and so the attempt to convert essentially ideological debates into straightforward scientific hypotheses can misconstrue what is at stake and overlook important issues under debate (Oakeshott 1962, Bernstein 1976, Seliger 1976).

Insofar as science’s powers are rooted in methods aimed at studying nature independent of any ideologies, this also represents a limit to its application. While scientific inquiry can contribute to nearly any problem we face, science typically cannot determine the solutions to those problems on its own; to think otherwise is to fall prey to scientism. Most real-world problem solving involves more than just applying scientific results; it also involves complex philosophical and ethical judgments, whether or not those are explicitly articulated.

7. Conclusion

Although it is often lamented whenever science is politicized, this article shows how frequently scientific knowledge has been intertwined with broader social and political concerns. History does not entail that such politicization is acceptable or inevitable. History does suggest it is nothing new. So long as we believe that science will matter to the things we care about most deeply, we should expect such contestations to continue in the future. Seen this way, ideological debates over science illustrate just how central science is in the modern world. Ideologically contested science is not a sign that we fail to value science; to the contrary, it shows us just how much all partisans agree that science is central to their advocacy. Of course, this can be problematic if science is misrepresented in order to justify particular interests.

Ideologues have often claimed science to be on their side. That is not surprising, given the cultural status of science, and given that ideologies are usually informed by some factual, putatively scientific claims. This article has shown how science has been used to support various ideologies.

It has also shown how ideologies can make their way into science. In the West, science has often been shaped by dominant ideologies which have privileged the white, the male, and the heterosexual, while demoting or pathologizing non-Europeans, women, and homosexuals. It seems clear that scientists have sometimes drawn on widely shared social beliefs when they are doing science, and that such ideologies can influence their science. Thus, it is problematic, to say the least, when those scientific results are then cited as independent evidence for the ideologies themselves (Lewontin 1992).

On the other hand, science has also been used as a check or bulwark against inhumane ideologies, such as Darwin’s fight against the slave trade or Dobzhansky’s arguments against eugenics. In these ways, ostensibly scientific disputes can also be sites of adjudicating ideological conflict, though such adjudication necessarily draws on more than just scientific data.

If ideologies can be assimilated into science, science has also challenged traditional beliefs and ideologies. As one classicist argues, “Ancient science is from the beginning strongly marked by the interplay between, on the one hand, the assimilation of popular assumptions, and, on the other, their critical analysis, exposure and rejection, and this continues to be a feature of science to the end of antiquity and beyond” (Lloyd 1983). Science and ideologies can adjust to one another, and this process is ongoing.

A close look at the history of science makes any clean-cut division between science and ideology appear artificially imposed. The history of science instead engenders a sense for the complex assortment and rearrangement of ideas that can problematize any straightforward isolation of the scientific from the ideological. Indeed, most contemporary historians and sociologists of science make sense of scientific changes partly by recognizing science’s permeability to cultural pressures. Political and religious frameworks can influence the questions scientists ask, which research they take to be significant, how they assess its importance, and even how long particular problems are worth pursuing.

As one historian put it, “The lines between science, ideology and world view are seldom tightly drawn” (Greene 1982). The point is that science has historically been enmeshed with social trends and beliefs that include ideologies. Historian Bob Young went so far as to claim that ideology is pervasive: “Ideology is an inescapable level of discourse” (Young 1971).

While the historical cases sketched above are well documented, the philosophical conclusions we might draw from them remain contested. For instance, one view is that they are unfortunate instances of science gone bad. Another is that perhaps they are cases where science is corrupted or objectivity is compromised. Optimistically, we might learn from them and try to remain more unbiased or ideologically neutral in the future. Perhaps self-awareness about our own social and political values will help secure more objective science.

However, it is possible that it will remain difficult to fully recognize exactly how broader patterns of thought, including background assumptions that are ideological in nature, influence scientific theorizing. Recent cognitive studies of implicit bias indicate that humans operate with biases they often do not recognize and which are difficult or impossible to eliminate. It remains to be seen how such biases might influence scientific theorizing. As was noted in section 1, ideologies are often difficult to recognize—especially in oneself—but their critical analysis is important not just for politics but for science as well.

Because ideologies are held by everyone, including scientists, they can sometimes explain why some scientific hypotheses are not pursued, while others are pursued or accepted uncritically. In his published writings at least, Darwin seems to have rejected out of hand the hypothesis that women could be cognitively equal to men; such equality would seem extremely implausible given the Victorian gender norms that Darwin generally shared. For other scientists, hypotheses such as the genetic determination of intelligence have been uncritically accepted because they fit a favored ideological narrative (Richardson 1984).

It is possible that ideologies find their way into science more effectively among homogeneous groups of scientists. Examples such as the longstanding research program of white men asking why women and minorities were so much less intelligent are at the very least suggestive. Who is doing the science may very well influence what scientific questions are asked, which of course relates to what conclusions are reached. Some philosophers argue that more diverse groups of inquirers can foster objectivity. On this view, the lack of diversity in science is no mere political or moral problem, but an epistemic problem. Insofar as modern sciences are no longer primarily the pursuit of individuals, but a collective enterprise to be analyzed at the community level, objectivity might best be achieved among groups with different backgrounds or life experiences (Longino 1990). Analyses of the relationship between social position and scientific knowledge were pioneered by feminist philosophers but have since become mainstream (Richardson 2010). Some empirical evidence indeed suggests that ethnic and geographic diversity among researchers can improve scientific results (Adams 2013; Freeman and Huang 2014).

8. References and Further Reading

  • Adams, Jonathan. 2013. “Collaborations: The fourth age of research.” Nature 497: 557-560.
  • Arendt, Hannah. 1973. The Origins of Totalitarianism. New York: Harcourt Brace Jovanovich.
  • Baker, Jennifer L., Charles N. Rotimi, and Daniel Shriner. 2017. “Human ancestry correlates with language and reveals that race is not an objective genomic classifier.” Scientific Reports 7: 1572.
  • Beatty, John. 1994. “Dobzhansky and the Biology of Democracy: The Moral and Political Significance of Genetic Variation.” In The Evolution of Theodosius Dobzhansky, edited by Mark B. Adams. Princeton: Princeton University Press.
  • Bernal, J. D. 1939. The Social Function of Science. New York: The Macmillan Company.
  • Bernstein, Richard J. 1976. The Restructuring of Social and Political Theory. New York: Harcourt Brace Jovanovich.
  • Brandt, Allan M. 2012. “Inventing Conflicts of Interest: A History of Tobacco Industry Tactics.” American Journal of Public Health 102 (1): 63–71.
  • Brooke, John Hedley. 1991. Science and Religion: Some Historical Perspectives. Cambridge: Cambridge University Press.
  • Brooke, John Hedley. 2016. “Order in the Relations Between Religion and Science? Reflections on the NOMA Principle of Stephen J. Gould.” In Rethinking Order, edited by Nancy Cartwright and Keith Ward. London: Bloomsbury Academic.
  • Carnegie, Andrew. 1920. Autobiography of Andrew Carnegie. Boston: Houghton Mifflin.
  • Chesterton, G.K. 1922. Eugenics and Other Evils. London: Cassell and Company, Limited.
  • Clarke, Edward. 1873. Sex in Education. Boston: James R. Osgood and Company.
  • Davis, Edward B. 2007. “Robert Boyle’s Religious Life, Attitudes, and Vocation.” Science & Christian Belief 19 (2): 117-138.
  • Desmond, Adrian and James Moore. 2009. Darwin’s Sacred Cause. Boston: Houghton Mifflin Harcourt.
  • Douglas, Heather. 2009. Science, Policy, and the Value-Free Ideal. Pittsburgh: University of Pittsburgh Press.
  • Effron, Noah. 2010. “The Myth that Christianity Gave Birth to Modern Science.” In Galileo Goes to Jail and Other Myths about Science and Religion, edited by Ronald L. Numbers. Cambridge, MA: Harvard University Press.
  • Freeden, Michael. 2003. Ideology: A Very Short Introduction. Oxford: Oxford University Press.
  • Freeman, Richard B. and Wei Huang. 2014. “Collaboration: Strength in diversity.” Nature 513: 305.
  • Gannett, Lisa. 2004. “The Biological Reification of Race.” The British Journal for the Philosophy of Science 55 (2): 323–345.
  • Gaukroger, Stephen. 2006. The Emergence of a Scientific Culture: Science and the Shaping of Modernity, 1210-1685. New York: Oxford University Press.
  • Gould, Stephen Jay. 1996. The Mismeasure of Man. New York: W.W. Norton & Company.
  • Gould, Stephen Jay. 1999. Rocks of Ages: Science and Religion in the Fullness of Life. New York: Library of Contemporary Thought.
  • Graham, Loren. 2016. Lysenko’s Ghost: Epigenetics and Russia. Cambridge, MA: Harvard University Press.
  • Hacking, Ian. 1983. Representing and Intervening. Cambridge: Cambridge University Press.
  • Harrison, Peter. 1998. The Bible, Protestantism, and the Rise of Natural Science. Cambridge: Cambridge University Press.
  • Harrison, Peter (ed.) 2010. Cambridge Companion to Science and Religion. Cambridge: Cambridge University Press.
  • Keller, Evelyn Fox and Helen E. Longino, eds. 1996. Feminism and Science. Oxford: Oxford University Press.
  • Kitcher, Philip. 2007. “Does ‘Race’ Have a Future?” Philosophy and Public Affairs 35 (4): 293-317.
  • Kevles, Daniel. 1985. In the Name of Eugenics. Cambridge, MA: Harvard University Press.
  • Kühl, Stefan. 1994. The Nazi Connection: Eugenics, American Racism, and German National Socialism. New York: Oxford University Press.
  • Lewontin, R. C. 1992. Biology as Ideology. New York: HarperCollins.
  • Lloyd, G. E. R. 1983. Science, Folklore and Ideology. Cambridge: Cambridge University Press.
  • Longino, Helen. 1990. Science as Social Knowledge. Princeton: Princeton University Press.
  • Martin, Emily. 1991. “The Egg and the Sperm.” Signs 16 (3): 485-501.
  • Marx, Karl and Friedrich Engels. 1938. The German Ideology. London: Lawrence & Wishart.
  • Merchant, Carolyn. 1990. The Death of Nature. New York: Harper Collins.
  • Merton, Robert K. 1942. “A Note on Science and Democracy.” Journal of Legal and Political Sociology 1: 115-126.
  • Milam, Erika Lorraine. 2010. “Beauty and the beast? Conceptualizing sex in evolutionary narratives.” In Biology and Ideology from Descartes to Dawkins, edited by Dennis R. Alexander and Ronald L. Numbers. Chicago: University of Chicago Press.
  • Mottier, Véronique. 2010. “Eugenics and the State: Policy-Making in Comparative Perspective.” In The Oxford Handbook of the History of Eugenics, edited by Alison Bashford and Philippa Levine. Oxford: Oxford University Press.
  • Numbers, Ronald L. 2006. The Creationists. Cambridge: Harvard University Press.
  • Numbers, Ronald L., ed. 2008. Galileo Goes to Jail and Other Myths about Science and Religion. Cambridge: Harvard University Press.
  • Oakeshott, Michael. 1962. Rationalism in Politics and Other Essays. London: Methuen & Co Ltd.
  • Oreskes, Naomi and Erik Conway. 2010. Merchants of Doubt. New York: Bloomsbury Press.
  • Paul, Diane B. 1994. “Dobzhansky in the ‘Nature-Nurture’ Debate.” In The Evolution of Theodosius Dobzhansky, edited by Mark B. Adams. Princeton: Princeton University Press.
  • Pfau, Thomas. 2015. Minding the Modern. Notre Dame: University of Notre Dame Press.
  • Plekhanov, Georgi. 1956. The Development of the Monist View of History. Moscow: Foreign Languages Publishing House.
  • Proctor, Robert N. 1988. Racial Hygiene. Cambridge, MA: Harvard University Press.
  • Proctor, Robert N. 2012. “The history of the discovery of the cigarette-lung cancer link: evidentiary traditions, corporate denial, global toll.” Tobacco Control 21 (2): 87-91.
  • Re Manning, Russell. 2013. The Oxford Handbook of Natural Theology. Oxford: Oxford University Press.
  • Richards, Evelleen. 2017. Darwin and the Making of Sexual Selection. Chicago: University of Chicago Press.
  • Richardson, Robert C. 1984. “Biology and Ideology: The Interpenetration of Science and Values.” Philosophy of Science 51 (3): 396-420.
  • Richardson, Sarah S. 2010. “Feminist philosophy of science: history, contributions, and challenges.” Synthese 177 (3): 337–362.
  • Roll-Hansen, Nils. 2005. The Lysenko Effect: The Politics of Science. New York: Humanity Books.
  • Rosen, Christine. 2004. Preaching Eugenics. New York: Oxford University Press.
  • Roughgarden, Joan. 2009. The Genial Gene: Deconstructing Darwinian Selfishness. Berkeley: University of California Press.
  • Ruse, Michael (ed). 1988. But Is It Science? The Philosophical Question in the Creationism/Evolution Controversy. Buffalo: Prometheus Books.
  • Russett, Cynthia Eagle. 1989. Sexual Science. Cambridge: Harvard University Press.
  • Seliger, Martin. 1976. Ideology and Politics. London: George Allen & Unwin Ltd.
  • Stegenga, Jacob. 2011. “Is meta-analysis the platinum standard of evidence?” Studies in History and Philosophy of Biological and Biomedical Sciences 42 (4): 497–507.
  • Topham, Jonathan R. 2010. “Biology in the Service of natural theology: Paley, Darwin, and the Bridgewater Treatises.” In Biology and Ideology from Descartes to Dawkins, edited by Dennis R. Alexander and Ronald L. Numbers. Chicago: University of Chicago Press.
  • Tuana, Nancy. 1989. Feminism and Science. Bloomington: Indiana University Press.
  • Waidzunas, Tom. 2015. The Straight Line. Minneapolis: University of Minnesota Press.
  • Winther, Rasmus and Jonathan Kaplan. 2013. “Ontologies and Politics of Biogenomic ‘Race.’” Theoria 136 (60), No. 3: 54-80.
  • Witherspoon, D. J., S. Wooding, A. R. Rogers, E.E. Marchani, W. S. Watkins, M. A. Batzer, and L. B. Jorde. 2007. “Genetic Similarities Within and Between Human Populations.” Genetics 176 (1): 351–359.
  • Young, Bob. 1971. “Evolutionary Biology and Ideology: Then and Now.” Science Studies 1: 177-206.
  • Yudell, Michael, Dorothy Roberts, Rob DeSalle and Sarah Tishkoff. 2016. “Taking race out of human genetics.” Science 351 (6273): 564-565.

 

Author Information

Eric C. Martin
Email: eric_martin@baylor.edu
Baylor University
U. S. A.

Analytic Perspectives in the Philosophy of Music

The philosophy of music attempts to answer questions concerning the nature and value of musical practices. Contemporary analytic philosophy has tackled these issues in its characteristically piecemeal approach, and has revived interest in questions about the ontological nature of musical works, the experience of musical expressiveness, the value of music, and other considerations. Priority is normally granted to the philosophical clarification of pure (or absolute) music, that is, music that is not accompanied by lyrics or a program and is otherwise lacking any reference to extra-musical reality. This is because most of the puzzles in the philosophy of music arise with particular strength in the case of pure music. For instance, although it is easy to explain why we would describe as “sad” a song with lyrics conveying a sad story, it is harder to see why we would call a piece of instrumental music “sad.” Unless otherwise stated, the word “music” in this article refers to pure music, that is, instrumental music.

While it would be hard to point to uncontroversial solutions to any of these problems, this is not to deny that substantial conceptual clarifications have been made. In the case of musical expressiveness, a fundamental distinction has been traced, and is widely accepted, between the expression of emotions as the manifestation of psychological states and expressiveness as the mere presentation of the outward characteristics associated with emotions. Conflating the former with the latter gives rise to the mistaken assumption that emotional descriptions of music must refer to an actual emotional state either in the listener or perhaps in the composer.

The field of musical ontology is largely a reflection of debates in general ontology, although some issues are peculiar to the musical case. For instance, philosophers have debated whether the differences in appreciative focus across musical traditions warrant a different ontological characterisation of works in those traditions. Consider the case of rock music: the main focus is often the record as opposed to the live performance of the piece, which is arguably the critical focus in the Western classical tradition. This may suggest that we ought to construe the work of rock music as ontologically different from the work of classical music, as the former is a track, whereas the latter is a work for performance.

Finally, analytic philosophy of music has attempted to solve the riddle of musical value: how is pure music valuable to our lives if it makes no reference whatsoever to our world? The most original solutions to this problem have tried to show that it is precisely the music’s abstractness that explains its value and appeal.

Table of Contents

  1. Definitions of Music
    1. Definitional Proposals
    2. Related Issues
  2. Musical Expressiveness
    1. Two Basic Distinctions
    2. Accounts of Musical Expressiveness
      1. Arousal Theory
      2. Resemblance Theories
      3. Persona Theory
      4. Other Accounts
    3. Literalism vs. Metaphoricism
    4. Emotions Aroused by Music
      1. The Sceptical View
      2. Emotional Contagion
      3. Negative Emotions
  3. Ontology of Music
    1. Fundamental Ontology
      1. Nominalism
      2. Platonism
      3. Sceptical Views
    2. Comparative Ontology
      1. Rock
      2. Jazz
      3. A Sceptical View
    3. Performance Authenticity
  4. Musical Understanding
    1. Concatenationism
    2. Architectonicism
  5. Musical Value and Profundity
    1. Values of Music
    2. Profundity
  6. References and Further Reading

1. Definitions of Music

a. Definitional Proposals

In comparison to the extensive scrutiny devoted to the general definition of art, the definition of music has received little attention. One may be tempted to dismiss the need for a philosophical definition, as music textbooks routinely present definitions of music that are taken to be relatively uncontroversial. However, while music textbooks may be unanimous in defining music as sound sequences that present elements such as melody, harmony, and rhythm, none of these features is necessary for something to count as a piece of music. Moreover, the occurrence of melodic intervals and rhythmic patterns in natural contexts suggests that these features are also insufficient to make something music: there are melodic intervals in birdsong, pitched sounds produced by the howling of the wind, and rhythmic patterns in heartbeats, but none of these should count as music (at least under the reasonable assumption that music requires human agency).

Examined here are two prominent attempts at a definition of music, a sceptical view of those attempts, and issues broadly related to the definitional problem.

Jerrold Levinson starts from the intuitive notion that music is organized sound (“The Concept of Music” 269). While this may seem correct, it does not yield a definition with the intended scope, as it would include human speech, Morse code, animal calls, and countless other non-musical phenomena. A possibility is to amend the definition by specifying that the organized sounds in question are produced for the purpose of aesthetic appreciation. While this would exclude some of the examples mentioned above, it would also fail to include what are arguably central cases of music the purpose of which is not that of being appreciated aesthetically. This is the case for military music, some music accompanying ritual, at least some film music, and other instances of music in which its main function is not related to its aesthetic appreciation. The amended definition would also problematically include sound arts other than music, such as poetry. Levinson believes that these shortcomings may be resolved if we define the purpose of music as the enrichment or intensification of experience achieved through an active engagement with it, where the active engagement may include activities ranging from attentive listening, to dancing, and to marching to the music. Music for dancing, marching, or praying would thus be included in the definition, as our experience is heightened, intensified, or otherwise enriched by our active engagement with organized sounds. To this qualification we must add another one: in music we engage with sounds primarily as sounds. This further caveat is necessary to exclude cases such as spoken poetry, where our engagement with the sounds primarily aims at the linguistic meaning they convey. 
From these observations we arrive at a definition of music as “sounds temporally organized by a person for the purpose of enriching or intensifying experience through active engagement (for example, listening, dancing, performing) with the sounds regarded primarily, or in significant measure, as sounds” (“The Concept of Music” 273).

Against Levinson’s proposal, Andrew Kania observes that the above definition is too narrow (“Definition” 8). A musician’s daily practice of scales, or a violin tune played to startle a friend in the middle of the night, ought intuitively to count as music, yet they fail to meet the requirements set out by Levinson’s definition: scale practising is not meant to enrich or intensify experience, nor is one’s playing the violin to play a prank on a sleepy friend. More problematically, the whole category of Muzak is excluded by Levinson’s definition (by Levinson’s own admission), as Muzak is not produced with the purpose of enriching or intensifying experience, but rather with that of inducing a particular mood or attitude. Kania observes that this seems to confuse classificatory and evaluative issues: Muzak may be bad music, but it certainly is music.

These cases may tempt one to include in the definition features such as pitch and rhythm, as these may allow us to include the examples unduly excluded by Levinson. But to make these a necessary feature would make the definition too restrictive, in that it would exclude avant-garde music that lacks pitched sounds or a rhythm, such as Yoko Ono’s Toilet Piece (1971), which is constituted by the sound of a flushing toilet. Kania’s strategy to get out of this impasse is a disjunctive definition (“Definition” 12). His proposal reads as follows: “Music is (1) any event intentionally produced or organized (2) to be heard, and (3) either (a) to have some basic musical features, such as pitch or rhythm, or (b) to be listened to for such features” (Kania, “Definition” 11).

Note that the disjunction allows us to include both a musician’s practice routine, which meets condition 3(a), and cases such as Ono’s Toilet Piece, which lack such elements but presuppose that we would listen for such features, as they are typical of most music.

Against these attempts, Jonathan McKeown-Green has argued that definitions attempting to preserve our pre-theoretical intuitions as to what music is may fall short of providing what we reasonably expect from a definition of something. He suggests that definitions such as Kania’s and Levinson’s are ill-equipped to provide a “future-proof” definition of music, as further developments of current musical practices may change folk intuitions in such a way as to make their current definitions unable to include things that future folk intuitions would consider music. While McKeown-Green leaves open the possibility of future methodological refinements that may address these issues, his view casts doubt on the definitional enterprise.

b. Related Issues

In addition to these disputes, which target clearly and specifically the definitional issue, other contributions address the question of what music is in more peripheral ways. For instance, Stephen Davies (“John Cage’s 4’ 33””) and Julian Dodd (“What 4’ 33” Is”) discuss the issue of whether silent pieces, such as John Cage’s famous 4’ 33”, should indeed count as music. While they both hold it should not, and prefer to classify it as a non-musical work for performance, they disagree about the nature of the work. According to Davies, 4’ 33” contains the environmental sounds that occur while it is being performed—he compares this to “an empty picture frame that is presented by an artist who specifies that her artwork is whatever can be seen through it” (459). Against this, Dodd holds that the work is merely about those environmental sounds. For, if the work is a work of performance art—something Davies grants—then it is impossible for it to include, as part of its content, sound events that are not performed by the work’s performers (6–8).

Other philosophers have focused on the distinction between natural and musical sounds, or, more generally, non-musical and musical sounds. Roger Scruton (19) distinguishes the latter two by the way we listen to them: we attend to non-musical sounds causally, as we are interested in the sounds’ sources, whereas musical sounds are listened to acousmatically, that is, independently from their sources.

John Andrew Fisher considers causal listening a possibility both in the case of musical sounds and natural sounds, but he draws the distinction between the two by specifying that they are produced by different objects: whereas natural sounds are produced by ecologically natural objects, musical sounds are produced by artefactual objects, such as musical instruments (“The Value of Natural Sounds”). This distinction grounds the otherness that is typical of natural sounds and the experience of inevitability that is associated with them. Additionally, Fisher characterises natural sounds as being attentionally unframed (a natural soundscape does not prescribe privileged focus on a foreground, whereas this happens regularly in the musical case), temporally unframed (a natural soundscape does not have a beginning, midpoint, or end), and unrepeatable (unlike most musical works) (“What the Hills Are Alive With”).

John Dyck has challenged both Scruton’s and Fisher’s accounts, on the grounds that they leave unexplained the way in which natural and musical sounds coexist in sound art. Consider for instance works such as Jon Hopkins and King Creosote’s album Diamond Mine (2011), in which musical moments unfold over a background of environmental sounds. In mixed contexts such as this, we cannot appeal to incompatible ways of listening (causal vs. acousmatic) or incompatible standards of evaluation (attentionally and temporally unframed vs. framed). In other words, a suitable account of the distinction should not explain just the difference between the two types of sounds, but also their interaction. Dyck proposes the following dual distinction: natural and musical sounds differ causally, in that the former are caused by natural objects, the latter by artefactual objects, and acousmatically, in that the former “tend to have a greater variation of microtones, microrhythms, and microtimbres than human environments” (Dyck 298).

2. Musical Expressiveness

a. Two Basic Distinctions

Discussions of musical expressiveness are likely to begin by distinguishing between expressing an emotion and being expressive of an emotion. The distinction has been standard since at least Kivy (The Corded Shell, 1980), although it can be found earlier in Tormey (1971). Expressing an emotion means to outwardly manifest a felt emotional state. For instance, I feel sad and express my sadness by weeping and being downcast. For something to be expressive of an emotion, on the other hand, means merely to display the outward manifestations of such an emotion. For instance, a Saint Bernard’s face is expressive of sadness because its snout presents the drooping features associated with sadness, although the dog may be perfectly happy. Similarly, an actor’s behaviour on the stage over the course of a play is expressive of a number of emotions without the actor necessarily going through these emotions himself. This opposition distinguishes expressive contexts that require an actual emotional state (my behaviour is expressing sadness only if I am actually sad) from expressive contexts that do not require such a state (for the actor and the Saint Bernard to look sad, nobody needs to feel actual sadness). Contemporary analytic philosophers are inclined to take music to be an example of the latter case. While the emotions expressed by the music may often be related to actual emotions, such as when listening to a sad song leads us to feel sad, the music is expressive of emotions independently of anyone’s felt emotional state.

Another important, related distinction is between the emotions in the music and those in the listener. Lay people are inclined to confuse conceptually (if not phenomenologically) the emotions aroused by the music with the emotions expressed by the music. Consider this example: a happy song at a party makes someone feel cheerful. The lonely guy in the corner hears the cheerfulness of the song too, yet his depressed mood isn’t affected by it. Or if it is, the music’s happiness may even be a source of frustration. The contrast is between happiness as a state the music induces in the listener and happiness as a state attributed to the music itself. Section 2.b deals with accounts of the latter phenomenon, whereas section 2.d examines philosophical issues related to the former.

b. Accounts of Musical Expressiveness

i. Arousal Theory

While the previous section distinguishes the music’s emotional expressiveness from emotional arousal, an elegant view describes the former as an instance of the latter. In its crudest form, the idea explains the music’s expressiveness of an emotion in terms of the music’s disposition to arouse such an emotional state in a listener. This is the arousal theory of musical expressiveness.

In this basic form, the theory is doomed to failure. On the one hand, some listeners who perceive the music’s expressive character deny ever being moved to feel such emotions themselves. On the other hand, the emotions a piece of music has a disposition to arouse may differ from those we ascribe to the music itself (think again of the guy in the corner, who was frustrated by the music’s happiness). Additionally, the theory cannot explain the way in which expressiveness contributes to the music’s value: if expressiveness is reduced to emotional arousal, then a suitable emotion-inducing drug could supply whatever value is provided by the music’s expressive character. This goes against the intuition that the value of a musical piece’s expressiveness is intrinsically linked to the music and could not be retrieved otherwise. Finally, the arousal theory fails to explain why we would listen to music that is expressive of fear, anguish, or other negative states: if these expressive properties were to be analysed as the music’s disposition to arouse similar emotional states in us, we would probably refrain from listening to such music altogether (more about this in section 2.d.iii).

Derek Matravers defends a version of the arousal theory that he believes capable of facing these difficulties. He claims that the emotions aroused by music are not full-blown emotions, but rather feelings, as they are deprived of the cognitive component typical of emotions. Moreover, Matravers denies that the feeling aroused by the music is always, and only, the one ascribed to the music. Rather, the listener’s emotional response may vary, as does our emotional response to emotions in human beings. Sad music, for instance, is music which normally arouses emotional responses of the sort that would constitute an appropriate reaction to someone’s expression of sadness. These responses are arguably limited, but certainly are not restricted to sadness only. We may for instance appropriately react to sadness with compassion or pity.

While Matravers’ work remains a classic reading in contemporary analytic philosophy of music, his view is normally deemed incapable of solving at least some of the problems that threaten cruder versions of the arousal theory. Justine Kingsbury observes how in other contexts we hardly ever run together the expression of an emotion (or feeling) and its arousal. One may be saddened by other people’s happiness, or worried by someone’s continuous expressions of anger, or feel some sort of Schadenfreude when confronted with expressions of distress. Given the commonplace nature of the conceptual distinction between emotional expression and arousal, it would be weird to think that these should be analysed as equivalent in the musical case.

Matravers would presumably respond to this objection by saying that the two cases are akin as in both cases the appropriate response to emotional expression is an emotional response. We react with sadness (or pity) to someone’s sadness, and we react to sad music in a similar way. But this reply would need to deal with Kingsbury’s other objection: on what grounds can Matravers disqualify as inappropriate the reaction of the listener who does not feel appropriate emotional reactions to sad music? While there are reasons to describe as appropriate the emotional reaction to the misery of another human being, it is unclear in what sense the expressive character of inanimate objects such as musical works requires or invites an emotional reaction.

ii. Resemblance Theories

Resemblance theories of musical expressiveness hold that the music’s expressive properties are due to their resemblance to human expressive behaviour. This is probably the most widely supported philosophical theory of musical expressiveness, and it was first independently proposed by both Stephen Davies (“The Expression of Emotion in Music”) and Peter Kivy (The Corded Shell).

While the two versions of the theory are often discussed together, it is worth stressing their differences. In order to do so, I consider Kivy’s theory first, and then move on to Davies’. After that, I consider objections raised against both views.

The resemblance theory defended by Kivy is known as the contour theory of musical expressiveness. It owes its name to the intuition that the reason why music is expressive of emotions is to be found in the resemblance between melodic contour and human emotional prosody. In other words, music expressive of sadness sounds like human speech when we are in the grip of sadness, and so it acquires its expressive character. According to Kivy, resemblances between music and human behaviour are not limited to vocal behaviour, but also include resemblances to bodily behaviour. Music that is sad moves downwards and slowly, whereas happy music is sprightly and often proceeds by leaps.

According to Kivy, resemblance is not the only source of musical expressiveness. He claims that it is impossible to make sense of the expressive character of some elements of the Western musical tradition on the grounds of their resemblance to human expressive behaviour. His example is that of major and minor chords, which do not resemble in any salient way the vocal or bodily behaviour of happy and sad people, yet are consistently described as happy and sad respectively. Kivy’s solution is to assume that some musical features acquire their expressive character by convention (The Corded Shell 80). This is not unproblematic: how could we successfully establish the conventional connection between sadness and minor chords if these sounded entirely neutral at first?

Davies calls his theory appearance emotionalism. It holds that music is expressive because it resembles emotion characteristics in appearance, that is, the outward manifestations of human emotions. Davies is inclined to stress the importance of the music’s resemblance to human bodily expressive behaviour, as opposed to vocal behaviour (“Artistic Expression” 182). The theory shares with Kivy’s contour theory the idea that music’s expressive character depends on its resemblance to human expressive behaviour and is independent from any actual emotion in the composer or in the listener. Davies points out three main differences between his view and Kivy’s (Musical Meaning 260–267). First, he denies that music, strictly speaking, expresses emotions, as it merely presents the aural appearance of expressive behaviour, and this does not warrant talk of expression. Second, he concedes that music may express Platonic attitudes, that is, emotional states that require an object, such as admiration, pride, or hope. According to Davies, this may be achieved by suitably long and complex musical passages, which convey the succession of feelings and behavioural components typical of such attitudes. Third, Davies claims that music may be about the emotion it expresses, whereas Kivy holds to the formalist view that music isn’t about anything at all. While emotion characteristics in appearance do not by themselves refer to the emotion they are expressive of, they may do so in the appropriate context. Think of using a picture of a Saint Bernard’s sad-looking face to show how you are feeling. In this case, the emotion characteristic in appearance presented by the picture would be referring to your emotional state. Likewise, music can be about the emotions it presents.

A historical note: the intuition that the music’s expressive power lies in its resemblance to human expressive behaviour is an old one and can be traced back to Plato. The resemblance theories proposed by Kivy and Davies advance this idea while at the same time detaching it from the assumption that the emotions in the music had to be related to actual emotional states either in the listener or in the composer. In this resides their main element of novelty.

Resemblance theories have been criticised on numerous grounds. Various commentators have argued that, while they may correctly characterise resemblance to human expressive behaviour as (part of) the explanation of why we hear music as expressive of emotions, they fail to characterise the experience of musical expressiveness (see Levinson “Musical Expressiveness” 195–199, and Matravers ch. 7).

Additionally, Levinson has argued against Davies that appearance emotionalism is unable to describe what would count as the musical presentation of emotion characteristics: “(w)e can give content to ‘sad human appearance’ by glossing it as ‘the appearance sad humans typically display.’ But we can’t analogously give content to ‘sad musical appearance.’ There is no such thing as the appearance or kind of appearance that sad music typically displays” (“Musical Expressiveness” 197).

iii. Persona Theory

Levinson has defended the view that musical expressiveness is essentially the expression of a fictional musical agent, or “persona.” His assumption is that expressiveness can make sense only if it is reduced to some kind of expression: the puzzle of expressiveness is to understand how it is possible for some objects deprived of a psychological life, such as works of music, to be described as possessing psychological properties like happiness or sadness. The riddle is readily solved if we postulate that whenever we hear expressive music, we are hearing it as the expression, in music, of the emotions of a fictional musical agent.

Critics of Levinson’s view tend to stress how competent listeners seem to be able to detect and appreciate the music’s expressive character without any imaginative engagement with a fictional agent they hear in the music (Davies, “Artistic Expression” 189). Levinson’s reply to this is that these processes may often not be conscious. A second, more radical objection to the persona theory holds that, even granting for the sake of the argument that we do in fact hear music as the expression of fictional individuals, a piece of pure music is typically unable to constrain a plausible and coherent narrative about its development. Is the work the expression of a single persona or multiple ones? Is the dialogue between the strings and winds a fight between two imaginary agents or the internal struggle of a single one? In other words, a work of music underdetermines the coherent narratives in terms of musical personae it may elicit. The problem with this is that it is unlikely that all of these narratives will result in a similar verdict with regard to the piece’s expressive character (Davies, “Artistic Expression” 190).

iv. Other Accounts

Jenefer Robinson’s account in Deeper than Reason is noteworthy for the attention it devotes to empirical research on emotions, as well as for its attempt to develop a notion of expressiveness that could be applied to art forms other than music. According to Robinson, highly expressive works of art allow the appreciator to feel what it is like to be in the emotional state the work is expressive of (see, for instance, Robinson 290).

Unlike Levinson, Robinson does not believe that all expressiveness requires an expressing persona. She contends, however, that some music in the Western canon invites such a listening. Relatedly, it is also noteworthy that Robinson is willing to make concessions to the discredited expression theory of expressiveness, according to which a work of art’s expressive properties are due to its creator’s emotional state. As it is, this theory is untenable: we know that artists have created exuberant and joyful works while being depressed, and it is in any case unlikely that an artist will remain in a single emotional state throughout the creation of a complex work of art such as a symphony. Robinson concedes, however, that some musical works, particularly those in the Romantic tradition, may present an emotional state felt by their authors (325). In these cases, we may be justified in identifying the persona in the music as the work’s author.

Charles Nussbaum has defended a sophisticated version of the arousal theory built around the idea that we form a mental representation of a musical work as a virtual terrain. Just like the ordinary space surrounding us, musical space offers affordances, that is, action possibilities. On this view, the arousal of feelings by music is due to off-line motor states that the music puts us into in virtue of our spatial representation of the musical surface (214). Nussbaum’s theory is ambitious and has probably not yet received the sustained consideration it deserves. Some critics have doubted whether it can fend off standard objections to cruder versions of the arousal theory (see for instance Trivedi 47–48).

Saam Trivedi in 2017 defended an imaginationist account of musical expressiveness. According to him, the experience of musical expression centrally involves imagination, although it may do so in different ways. The basic way we use imagination in relation to music is to imagine the music itself as a sentient being expressing its emotional states, but other types of imaginative engagement are available (133–139). For instance, we could imagine that the music is the expression of emotions of an indeterminate persona, or that we are ourselves in the emotional states the music is expressive of (139–143).

c. Literalism vs. Metaphoricism

A debate parallel to that concerning musical expressiveness is the one regarding the status of our descriptions of music in emotional terms. When we describe music as “sad,” “happy,” and the like, are we speaking literally or, rather, using metaphors in order to grasp aspects of the music that we cannot quite describe in literal terms? The former option is dubbed literalism, whereas the latter can be called metaphoricism.

An early metaphoricist proposal is the one by Nelson Goodman. He claimed that music metaphorically exemplifies expressive properties (85). Suppose you have a new suit made. The tailor shows you swatches of fabric to let you choose your preferred colour and material. The swatches possess a variety of properties, but exemplify only some of them—for instance, they exemplify colour and thickness, but not size. A way to put this is to say that exemplification is possession plus reference. Goodman builds his account of expressiveness on this basic notion of exemplification, with the relevant difference that expressive properties, unlike properties such as colour or size, are not literally possessed by inanimate objects. In the case of expressiveness, then, exemplification is reference to a property that is metaphorically possessed by an object. For instance, a work of music is expressive of sadness if it refers to the property of sadness that it possesses metaphorically.

Goodman’s view has been frequently criticised, especially for the rather obscure notion of metaphoric possession that is central to it (see for instance Davies, Musical Meaning 145–150).

Roger Scruton holds common descriptions of music in spatial and emotional terms to be irreducibly metaphorical. They are metaphorical because they describe in spatial terms something that is not literally extended in space and in emotional and psychological terms something that has no mental states. These metaphors cannot be paraphrased into literal statements, yet they are indispensable because they describe the way in which we imaginatively engage with music. This claim receives support from Scruton’s broader account of musical understanding (see section 4; see Trivedi 67–72 for criticism of Scruton’s metaphoricism).

Against these theorists, Stephen Davies defends a literalist position (“Music and Metaphor”). His strategy is to appeal to the secondary meaning taken by emotion terms when they are used to describe the outward manifestations of emotions. For instance, we may describe a tragic mask as “sad,” and by this we would mean not that the mask is in some actual state of sadness, but rather that it displays the physiognomy associated with sadness. Emotional descriptions of music work in a similar way. When we call a piece of music “sad,” we are using the term in the secondary sense referring to the outward manifestations of sadness, its behavioural correlates, rather than in the primary sense referring to a psychological state. Davies clarifies his view by stressing that the connection between the two uses of the word “sad” (the psychological one, and the behavioural one) is not one of mere homonymy (as in the use of “bank” to indicate both a financial institution and a riverside), but rather an instance of polysemy, that is, of distinct but related meanings (as in the use of “mole” to refer to both a burrowing animal and an undercover agent).

d. Emotions Aroused by Music

There are two main issues related to the emotions aroused by music in listeners. The first is the question as to whether instrumental music may arouse emotions (at least some emotions) and how it may do so. The second is the question as to whether any of these emotions are relevant to the appreciation of music qua music.

i. The Sceptical View

I start from a sceptical view of emotional arousal defended by Peter Kivy (Music Alone ch. 8). While he does not deny that listening to music regularly arouses garden-variety emotions (happiness, sadness, and so on), Kivy denies that any emotion of this sort is relevant to the appreciation or understanding of music as music. This apparently sweeping claim is best understood in light of Kivy’s preferred theory of emotions, that is, a cognitive theory according to which emotions always come with a feeling-state component, an intentional object, and an appropriate belief. Pure music is deprived of the propositional content or extra-musical references necessary to supply a relevant intentional object and belief. So music alone cannot arouse such emotions in us. However, music often gives rise to all sorts of idiosyncratic associations in the listener’s mind. It is these that, according to Kivy, provide the material necessary to the arousal of happiness, sadness, and the like. It is a short step from here to a sceptical position: if garden-variety emotions are aroused by music because of associated content brought to mind by the listening experience, then it is, properly speaking, that content that does the arousal and not the music. Moreover, if the emotional arousal in question is prompted by content that is contingently related to the piece that calls it to mind, then the emotions aroused are irrelevant to the appreciation of the piece. They may in fact be of a completely different character to two different listeners who associate different contents with the piece in question.

There is only one sort of emotion that, according to Kivy, is connected to our appreciation of music. Unsurprisingly, this emotion fits the cognitive view of emotions in that it has an intentional object and a corresponding belief. More precisely, this nameless emotion takes the music as its object and is accompanied by the belief that the piece is beautiful, well-crafted, skilful, and so forth. Among the properties of the piece that may give rise to such an emotional response are also expressive properties. A sad musical work may be beautifully sad, that is, it may express sadness by particularly poignant and well-suited musical means. But this is not to say that the appreciation of such a characteristic is going to arouse sadness in us. Rather, the emotion aroused in these cases is the very same nameless emotion mentioned earlier, a response that takes the music as an object and is sustained by the belief that the music is skilfully, beautifully, and powerfully expressive.

ii. Emotional Contagion

Against the sceptical view, some philosophers hold that arousal of garden-variety emotions is possible without the aid of extra-musical associations. In particular, those who hold a more liberal view than Kivy’s are inclined to think that music may arouse in the listener the emotions it expresses, as happens in the case of emotional contagion from music to listener. This is the position defended by Stephen Davies, who rejects the cognitive theory of emotions. While some emotions may fit neatly into the template described by the cognitive theory, others do not. For instance, we may experience an objectless anxiety or a phobia that lacks the support of any relevant belief. Emotional contagion from music to listener is another example: we catch the music’s emotional state, but the music is not the intentional object of our emotional response (we are not sad about the music, but merely saddened by it; Davies, “Emotional Contagion” 51–52).

Jenefer Robinson’s view is similar to Davies’ in that she holds music to be capable of arousing emotional responses of a mirroring sort. However, she is critical of Davies’ description of the arousal process. In particular, she claims that Davies is mistaken in holding that emotional contagion is the result of a listener’s experience of musical expressiveness (392). According to Robinson, things are quite the opposite, as music is able to induce the emotional states it expresses both before we may realise it expresses them and independently of our capacity to do so. From this point of view, Davies’ description of the mirroring process (or emotional contagion) is unduly heavy on the cognitive side, as it describes contagion as dependent on the listener’s capacity to recognize the music’s expressive character.

Robinson provides an intriguing and empirically informed account of the contagion process. First, music expressive of an emotion e may elicit psychological and physiological changes typical of certain moods. Subsequently, the listener may latch onto environmental cues that may supply an intentional object to her emotion. For instance, I may be listening to a happy piece of music, and this may arouse a cheerful mood in me. The mood will convert into a full-blown emotion of happiness when I see something on my desk that reminds me of a friend who is far away but whom I will soon get to see. Robinson calls this process the “Jazzercise” effect (391).

Davies is sceptical regarding both Robinson’s objection and her account of the contagion process. Against the worry that he may give too prominent a role to the listener’s recognition of the music’s expressive character, he replies that he does not rule out what he calls “non-attentional contagion,” that is, the unconscious, emotional attuning to expressive features of the environment. He merely believes this to be less central a case than its attentional counterpart (Davies, “Emotional Contagion” 56).

Davies’ criticism of Robinson’s Jazzercise effect questions whether this is a genuine case of contagion from music to listener. If the music merely occasions physiological changes and the corresponding objectless mood, and if these need to be supplemented by environmental cues in order to result in the arousal of an emotion, then the object of our emotion is whatever feature of the environment aroused it. In the above example, if the happiness is prompted by what I see on my desk, then it would seem that we are in the presence of a standard emotion of the cognitive sort, one that does not take the music as an object (Davies, “Emotional Contagion” 58–60).

iii. Negative Emotions

Recall that one of the standard objections against the arousal theory questions the willingness of listeners to put themselves in negative emotional states by listening to music expressive of such states. And, as we have seen, various philosophers who reject the arousal theory claim nonetheless that music may in fact arouse in the listener the emotions it expresses. It then remains to be seen how they justify the listener’s toleration of, or even attraction to, deeply sad music, if such music has the disposition to arouse in them the negative emotional states it expresses. I examine two prominent answers to this problem.

Levinson considers the music’s expressive character as capable of arousing in the listener the feeling component of emotional states. This falls short of what is required to have a full-blown emotional state, which would require an intentional object and a relevant belief. It is exactly this that makes the musical arousal of emotions a rewarding experience, as the absence of the usual contextual implications for our lives of negative states allows us to relish and explore the phenomenological aspect of these emotions, that is, the feeling component aroused by the music. As Levinson puts it, “[w]e become cognoscenti of feeling, savoring the qualitative aspect of emotional life for its own sake” (“Music and Negative Emotions” 324).

Levinson further claims that additional benefits may be available to listeners who imaginatively engage with the feeling component aroused by the music and imagine themselves to be in a full-blown state of despair, sadness, or any other negative emotion (“Music and Negative Emotions” 326–329).

Davies is drawn to a more modest but perhaps more effective solution. He observes that many human activities that are valuable and sought after possess an intrinsically unpleasant or painful element: think of weight training or running. Listening to music expressive of negative emotions is one such activity: one of the ways in which we listen to music with understanding is by reacting emotionally to its expressive character, such as when we are made cheerful by happy music or sad by sad music. Because he describes the negative emotional response to sad music as an integral part of our understanding of such music, Davies avoids characterising negative emotional responses as something we endure in order to pursue some goal. He writes: “The response is not an incidental accompaniment but rather something integral to the understanding achieved. It is not something with which one puts up for the sake of understanding; it is an element of that understanding” (Davies, Musical Meaning 312).

3. Ontology of Music

Philosophical reflection on the ontological status of music has tackled three main problems: the fundamental ontological nature of musical works, the possible differences in ontological status of works belonging to different musical traditions, and the issue of what counts as an authentic instance of a piece. The three following sections examine these issues.

a. Fundamental Ontology

We know that the Mona Lisa is a painting in a large room in the Louvre; likewise, we know that the Palazzo Vecchio is a building in Piazza della Signoria in Florence. These objects seem relatively easy to locate and classify. Musical works, however, are elusive entities. Where is Bach’s Musical Offering, and what kind of thing is it?

Fundamental ontology is mainly concerned with the question as to what sort of entity musical works are, that is, to what ontological category they belong. Dodd calls this the categorial question (“Musical Works” 1114). Are works of music collections of particulars, or are they types that are instantiated by various performances?

Alongside this basic question, musical ontology addresses what Dodd has named the individuation and the persistence questions (“Musical Works” 1114–1116). The former deals with identity conditions: when are we to consider two works as the same? Is identity of notation sufficient? Should we include historical factors, such as a work’s date of composition? The persistence question, on the other hand, concerns a musical work’s coming into being as well as its possible destruction. Do composers create works, or do works exist prior to their composition, in which case they are merely discovered? And under what circumstances, if any, would a musical piece cease to exist?

Views of musical ontology are normally grouped according to the way in which they answer the categorial question—a practice I follow here. However, it is worth observing that pre-theoretical intuitions regarding a work’s identity and its creation or discovery are often decisive in accounting for a philosopher’s preference for one ontological category over another.

i. Nominalism

An early proposal advanced by Nelson Goodman is that we should consider a musical work to be a collection of particulars and, more specifically, as a set including all of the work’s correct performances. This view is appealing to those who, like Goodman, intend to avoid commitment to entities other than particulars. However, it runs into rather obvious problems. First, nominalism seems to convert contingent facts regarding a work’s performances into facts about the work of music they are performances of. For instance, suppose I write a piece of music for guitar this afternoon. I then perform it three times, but every time my performance contains a wrong note on bar 8, as the passage exceeds my technical abilities. The nominalist view would seem forced to draw the absurd conclusion that the piece itself contains a wrong note, as the composition exists only as the set of my three defective performances. Alternatively, the nominalist might embrace the equally counterintuitive view that the work in question has never been performed, for all of the available performances are defective and so do not really count as performances of the work.

A second, even more serious worry takes the form of a modal objection. It is arguably contingent for a work of music to have been performed a certain number of times. There is a possible world in which Thelonious Monk’s Straight, No Chaser has been performed two more times than it has in ours, and others in which it has been performed eight times fewer. But if we construe works of music as sets, a problem arises, for sets necessarily have just the members they do. The inability to account for this modal characteristic of the relation between a work and its performances seems to doom the nominalist project to failure.

ii. Platonism

Kivy has proposed what may be considered the most elegant way to account for the relation between a work and its performances. He suggests that the musical work is an eternal type and is realised in its various performances (Kivy, “Platonism in Music”).

This view has been questioned by Levinson, who stresses its inability to account for two rather central intuitions regarding musical works (“What a Musical Work Is” 65–78). First, we would consider two pieces identical in their sound structure but composed at different times to be two different pieces. This intuition is arguably grounded in the different properties we would ascribe to these two pieces (the earlier piece may be ground-breaking, the later one scholastic). Kivy’s view does not respect this intuition, as it identifies musical works with their sound structure and would therefore consider the two pieces to be identical. Second, we consider composers as the creators of the pieces they compose, whereas Kivy’s view holds that composers merely discover pre-existing sound structures.

Levinson has put forward an alternative proposal intended to better accommodate these intuitions. Consider first a non-musical example: the case of the Tarte Tatin. While this type of cake is certainly instantiated by a variety of tokens, it does not serve our intuitions well to hold that this model has always existed in the Platonic realm of eternal forms alongside mathematical entities and the like. The Tarte Tatin is a repeatable entity that was created at some point in time by someone who specified its ingredients, preparation, and so on. The case of musical works may be ontologically akin to the one just presented. We need to make sense of a musical piece as something that has been specified in its sound structure and performance means by some agent at a certain time. Levinson calls this ontological category an indicated type (“What a Musical Work Is” 79). More precisely, a musical work is a sound/performance-means structure-as-indicated-by-X-at-t. This characterisation of a piece’s ontological nature is also capable of accounting for the two intuitions mentioned above: the intuition that we should consider as separate works two pieces with an identical sound structure but composed at different times in the history of music, and the intuition that musical works are created rather than discovered.

Julian Dodd in 2007 revived the standard Platonist view, according to which musical works pre-exist their composition (“Works of Music”). I focus here on his rejection of Levinson’s arguments. Dodd’s first point concerns Levinson’s claim that a full-fledged Platonist view would fail to make sense of our intuition that composers, in composing a work, engage in a creative process. Dodd takes this objection to conflate the psychological notion of creativity with the metaphysical claim that something is created by composers. While the view that composers are creative is arguably correct, this view is simply expressing the idea that composers are engaging in a creative process, not that they are bringing something into existence. Dodd considers discoveries in the field of mathematics or logic as a useful parallel to the musical case: we do not deny that Pythagoras was a creative individual, even though we may well hold that the theorem that bears his name is an abstract entity that pre-existed its discovery by the Greek mathematician.

Dodd’s second objection to Levinson’s account questions its capacity to strike the intended compromise between the type/token view and our intuition that musical works are created. Indicated types, Dodd observes, are just as problematic as their non-indicated cousins in that they also pre-exist their discovery. Levinson is making the mistake of considering the impossibility of a type’s instantiation as equivalent to the type’s non-existence. But this is metaphysically suspicious, to say the least. As Dodd exemplifies, the type “child born in 1999” could not have been instantiated in the year 1066, yet we would not consider this as a reason to deny its existence in 1066. Ditto for works of music as indicated types. But if indicated types also pre-exist the act of composition, it would seem that we fall back into the idea that musical works are discovered rather than created.

iii. Sceptical Views

In a famous study, Lydia Goehr claimed that the concept of musical work we are familiar with appeared only in the 19th century, as earlier musical practice had looser criteria regarding a piece of music’s identity as well as a more diluted conception of authorship. For instance, scores were less precise in indicating embellishments and performance dynamics—if they did so at all. According to Goehr, this shows how the search for a fundamental ontological category is mistaken when it comes to a culturally variable and historically mutable practice such as music making.

Goehr’s dismissal of musical ontology is not typically welcomed by analytic philosophers of music. For instance, Stephen Davies observes that Goehr’s examples show that pieces of music composed prior to 1800 may have had a higher degree of indeterminacy in that they left more choices to the performer. But this falls short of supporting the view that the composers of such pieces were not creating works of music of a sort ontologically akin to those composed later on (Davies, Musical Works 123).

b. Comparative Ontology

Often referred to as “higher-order ontology,” comparative ontology explores alleged differences in the sort of musical work that lies at the centre of different musical genres. By way of example, I present here two debates concerning the correct ontological characterisation of works in two musical traditions: rock and jazz.

i. Rock

Theodore Gracyk pioneered philosophical reflection on rock music with his monograph Rhythm and Noise (1996). He argues that records are the primary artistic object produced by rock artists as they represent the focus of appreciation for rock fans and critics. While Gracyk is ready to concede that rock musicians also create songs, he denies to songs the central critical place he accords to records. The view he puts forward construes the rock tradition as fundamentally different from the classical one, as in the latter the object of attention is the work as determined by the score, quite apart from its instantiations. In the rock tradition, on the contrary, the particular manifestation of the song found on the relevant recording is the ultimate object of critical attention.

Stephen Davies agrees with Gracyk that works in the rock tradition rely heavily on studio wizardry, but he is unwilling to give up the idea that rock pieces are works for performance. After all, some rock bands only exist as garage bands, playing small venues and never recording their songs, while other major bands play a song live for quite some time before recording an album version. According to Davies, we can make sense of these practices if we describe rock songs as works for studio performance (as opposed to works for live performance, such as the works in the Western classical tradition). What distinguishes works for studio performance is that they are created with the studio as a privileged performance venue, as the studio allows the manipulation of the musical material that, as shown by Gracyk, is so central to the rock tradition (Davies, Musical Works).

In this way, Davies is able to accommodate the important intuition that there could be performances of rock works—rather than just playbacks of the relevant tracks—while still preserving the strength of Gracyk’s claim that studio production plays a paramount role in the sonic identity of rock pieces.

A fundamental difference between this view and Gracyk’s is that Davies intends to stress the continuity between rock and classical music: both traditions produce works for performance, although with different performance contexts in mind.

Christopher Bartel argues that both Gracyk and Davies are mistaken in considering the record as the primary object of appreciation. Grounding his claim on evidence concerning appreciative practices, he argues that several artists in the rock scene are appreciated for their skills as performers or songwriters. As an illustration of this, consider the contrast between the two hard rock bands Led Zeppelin and Deep Purple. These two iconic bands produced their most important records around the same time and played a relatively similar kind of music. However, while the former is appreciated for the polished, layered, and modern character of the tracks it recorded, the latter is considered by fans to be at its finest as a live band. Yet other rock artists are credited for the songs they have written over and above their value as performers or recording artists, Leonard Cohen being a case in point. Bartel concludes that “there are (at least) three practices central to the rock tradition, and musicians will place varying degree of emphasis on each” (153).

ii. Jazz

Various philosophers have examined the ontological peculiarity of jazz, with particular focus on the nature of jazz standards. Here I focus mainly on a debate between Andrew Kania and Julian Dodd.

Kania claims that jazz standards cannot fit the standard token/type ontology that seems apt to describe the relation between a work and its instantiations typical of the Western classical tradition. Jazz, according to Kania, is a workless musical tradition (“All Play and No Work”). While the claim may appear counterintuitive, Kania holds that no available realist view about jazz works could make sense of jazz performance practice. Kania offers three main reasons in favour of his view.

First, he argues that variation in the performance of a standard is too great to identify a core musical material that is common to every performance of that standard. Jazz standards, it would seem, cannot be located in the way works in the classical tradition can. Second, and relatedly, jazz standards do not constrain performance as classical Western works do. Third, Kania claims that jazz standards are not the focus of critical attention. Rather, it is their performances, and their improvisational elements in particular, that are normally subject to the greatest critical scrutiny. This is in stark contrast with what happens in the Western classical tradition, in which the work is the focus of attention.

Against Kania, Dodd holds jazz standards to be ontologically akin to works in the classical tradition. While they may be ontologically thinner, in that performers have more freedom with regard to the piece’s structure, instrumentation, length, and other features, works of jazz are repeatable works, the identities of which are grounded in instructions for performance determined by the composers. Central to Dodd’s rebuttal of Kania’s view is the idea that performance authenticity plays a more peripheral role in jazz than in classical music: we are interested mainly in what musicians do with a standard rather than in their correct performance of it, but this is not to say that the standard is ontologically different from works in the classical tradition (Dodd, “Upholding Standards”).

iii. A Sceptical View

Consider the pluralist view of rock ontology proposed by Bartel in 2017 and summarised above. It is a short step from the acceptance of this sort of pluralism to a full-fledged scepticism with regard to the enterprise of comparative ontology. For if there is no entity that is accorded pride of place when it comes to the appreciation and evaluation of rock music, then perhaps the whole idea of exploring the nature of the rock work rests on the mistaken assumption that there is one such thing. Lee B. Brown has suggested just that. He argues that, both in rock and jazz, what we have are multiple directions of critical and appreciative interest, and no ontological investigation could possibly identify a single ontological category as critically privileged without abandoning a descriptivist approach. He writes: “The truth is that rock history has not deposited any well-entrenched concept of the work of rock music” (174).

c. Performance Authenticity

Agreement concerning the fundamental nature of musical works does not imply agreement with regard to how they are correctly instantiated in performances. This section examines answers to the question as to what counts as a correct performance of a piece. In examining this issue, philosophers of music have mainly taken as a point of reference the tradition, starting in the 20th century, of historically informed music performance. Broadly construed, this tradition holds that pieces of music ought to be performed in a way sensitive to the period in which they were composed. While versions of this thesis are widely accepted in musical practice, philosophers and musicologists have debated the justification of such approaches.

As with other issues in the field, Kivy has been one of the earliest and most influential contributors, with a monograph on the topic (Authenticities). His book appeared in the same year as musicologist Richard Taruskin’s seminal Text and Act (1995), and both works share a degree of scepticism regarding the philological reconstruction of the original sound of past music. In particular, they share the intuition that the supposedly objective, evidence-based treatment of performance and instrumentation choices is an attempt to remove a central aspect of music-making, at least in the Western classical tradition: the contribution that the performer’s interpretation makes to a piece. Kivy describes this as an unfortunate trade-off between personal authenticity and other kinds of authenticity, particularly sonic authenticity, which is defined as the attempt at replicating the sound of past performances.

Recall that Levinson considers musical works to be structures comprising both sounds and performance means. As a consequence of this view, Levinson holds an instrumentalist position with regard to the instantiation of a work: a work of music is correctly instantiated only if it is performed with the musical instruments (or, more generally, performance means) prescribed by its score.

Stephen Davies distinguishes between ontologically thick and thin works of music (Musical Works). Thin works are comparatively less specific in prescribing performance means and other properties of a correct performance, whereas thicker works leave relatively less freedom to the performer. As an example, popular songs in the American songbook are thinner works than a Mahler symphony. A way to interpret Davies’ suggestion is to consider a compromise between the sonicist and instrumentalist positions just mentioned: there is no absolute standard when it comes to performance requirements, as works in certain traditions prescribe specific instrumentations, whereas other musical practices are more liberal.

Dodd has argued for a pluralist account of performance authenticity, distinguishing between compliance authenticity and interpretive authenticity (“Performing Works of Music”). Whereas the former is concerned with the accurate performance of the piece as specified by the score, the latter is a way of performing that displays a deep understanding of the piece. These two ways of performing may at times be in tension: there are occasions in which disregarding compliance concerns may help a performer produce a persuasive performance of a piece. In these cases, musical practice shows that concerns for interpretive authenticity may override concerns for compliance, as when a piece’s indicated tempo is disregarded by a performer because she deems a different tempo to be more suited to the piece’s character (Dodd, “Performing Works of Music” 9).

A sceptical view of historically informed performance has been expressed by James O. Young. He considers various formulations of the authenticity ideal animating historically informed performances of music and dismisses them as either unattainable or unattractive. Young concludes that contemporary performers engaging in historically informed performance are valuable for their artistic achievements and for their capacity to present the music they play under a new, stimulating light, and not because of their ability to retrieve the “authentic” version of a piece.

A final observation: although I have started by noting how the debate concerning the basic ontological status of musical works does not settle performance authenticity issues, it is worth stressing how the two problems are presumably connected. Historically loaded characterisations of musical works’ fundamental ontological nature are often paired with authenticity requirements that include the means of production of a musical structure (for example instrumentation, as in Levinson’s case), whereas fundamental ontologies of a platonic sort tend to set the bar low, in that parameters such as timbre or instrumentation are irrelevant to the instantiation of a musical work.

4. Musical Understanding

Music isn’t simply sounds we hear. It is sounds we listen to. Analogously to natural languages, the process of listening to music involves understanding it as music. But how exactly should this understanding be characterised? Contemporary analytic philosophy has produced a debate regarding the way in which we should describe basic musical understanding. The intent is to describe the minimum requirements for the appreciative understanding of a musical piece. The two main opposing views, championed by Levinson and Kivy, are termed concatenationism and architectonicism.

a. Concatenationism

According to Levinson, basic musical understanding is defined by our ability to follow the music’s development from one moment to the next (Music in the Moment).

In order to describe this process, Levinson introduces the concept of quasi-hearing. This refers to the process of attentive listening that encompasses the moments immediately preceding the present one, and that, on this basis, anticipates the music’s short-term development.

Basic musical understanding, as characterised by Levinson, does not include a grasp of large-scale structures, such as the exposition-development-recapitulation characteristic of sonata form. While Levinson does not deny that many educated listeners do pay attention to formal musical features, he denies that awareness of these aspects is required in order to satisfactorily understand a piece of music.

In a 2015 defence of his view, Levinson also appeals to empirical research showing how even accomplished musicians are insensitive to significant changes in large-scale structure, as long as they are able to follow the music’s flow from one moment to the next (“Concatenationism” 42).

b. Architectonicism

Against Levinson, Kivy observes that part of the Western classical music canon is impossible to understand without some degree of awareness of large-scale musical structure (“Music in Memory”). Kivy agrees that momentary listening is basic in the sense of being presupposed by any other kind of musical understanding. If one cannot follow the music’s moment-to-moment progress, one cannot understand music at all. But to concede this is not to say that all music may be understood by listeners who follow only the music’s unfolding in the short span covered by quasi-hearing.

A third party in this dispute, Stephen Davies, offers a criticism of Levinson’s view that downplays the difference between concatenationism and architectonicism (“Musical Understandings”, 95–99). He observes that Levinson seems to present momentary listening and structural listening as distinct psychological processes, the former involving perceptual awareness and the latter some sort of cognitive appraisal. But this need not be the case. For instance, our recognition of a theme as it returns after several minutes from its first appearance is perceptual in that it does not involve explicit knowledge regarding the work’s structure, yet it arches back to a part of the piece that clearly lies outside the scope of our quasi-hearing capacities. Accordingly, Davies claims that “Levinson shows not that grasping a work’s overarching form is irrelevant to musical understanding but that such awareness must arise from the listening experience” (“Musical Understandings” 97).

5. Musical Value and Profundity

Is there a value intrinsic to pure instrumental music? For the purpose of this section, I define an intrinsic value of a work of art w as a value that is unavailable to those who do not experience w. This means a work of art’s intrinsic value is not merely instrumental, unlike, for instance, the work’s capacity to generate wealth if sold at an auction. While it may be conjectured that representational art-forms possess a value related to their representational content, this move is impossible in the case of pure music, as this lacks by definition any ties to the real world. Where, then, does the value of music reside?

a. Values of Music

But we may be moving too quickly, for it is not beyond dispute that pure music indeed lacks any extra-musical reference. For one thing, as we have seen, many philosophers believe music to be expressive of emotions. (Though they may not agree on whether music may also be about the emotions it expresses—recall that this is a major difference between Kivy’s and Davies’ accounts.) If pure music indeed does have an emotional character, it may be that part of its value as an art form is related to this feature. As an example of this, recall Robinson’s view that a musical piece’s expressive character articulates and individualises an emotion and may allow a listener to feel what it is like to be in that emotional state. If correct, this account offers an elegant insight into the value of expressive music, for it shows that music has the capacity to make us understand what it is to feel an emotion without our having to undergo the full-blown emotion.

Let us now restrict our focus to those who seek to explain the value of music apart from whatever value may ensue from the music’s capacity to be expressive of emotions. This move is necessary because, regardless of how optimistic one may be with regard to the value of musical expressiveness, one will be forced to admit that much great music is lacking in expressive power. Any value possessed by music of this kind would have to be of a sort different from the value connected to the music’s expressive character.

The most promising strategy in order to explain the value of music, despite its lack of any evident connection to our world, is to bite the bullet of abstraction and claim that pure music is valuable precisely because of its abstract nature. This is the strategy pursued by Alan H. Goldman. He observes that it would not be sufficient, in order to establish the peculiar value we accord to music, to point out the ways in which it expresses emotions. For literature and the visual arts surely do so with greater precision, and the scope of emotional states they are able to represent lies outside the possibilities for pure music. The real value of music, according to Goldman, resides in its capacity to fully engage us in the exploration of an alternative world. Goldman fleshes out this proposal by noting that musical tones are experienced as independent from their material sources and constitute a virtual musical space (39–40). Moreover, the development of music is experienced as purposive: the music goes through struggles and developments and finally finds rest—at least in tonal pieces. But this must solve only part of the enigma, for Goldman has so far suggested only that the experience of music is the experience of an alternative world. He hasn’t yet explained why such an experience is valuable to us. His suggestion is that in addition to the capacity music has to allow us to escape our daily concerns regarding the actual world, there is a particularly welcome feature to the alternative world music opens to us. The musical world is designed, and its dissonances and hesitations are finally resolved as the piece comes to an end. The world of music is “a totally human world in which threats are tamed even when tinged with pathos or other negative emotions throughout” (42).

Malcolm Budd also attempts to explain the value of music through its abstract nature. He likens the appeal of music to that of other natural and artefactual objects featuring abstract patterns (165). The peculiarity of music is that these abstract gestalten are offered as developing in time. Moreover, music presents us with formal structures that reach levels of complexity hardly imaginable in other contexts, in that the formal structures of music are hierarchically organized and related to other structures within the same piece: think of an arpeggio as a sonic gestalt, embedded in a chord progression, which we experience as a larger gestalt (168). While we may be confronted with similar levels of formal complexity in the case of logic or mathematics, this abstract complexity is rarely given perceptually, and the formal structures we deal with in those cases arguably do not have as their primary goal the exploration of aesthetically rewarding structures.

Budd also observes how abstractness does not preclude references to the extra-musical altogether: pure music may exemplify relations that are not, qua relations, exclusive to music. (The concept of exemplification used by Budd is the one introduced by Goodman and presented in section 2.c.) Consider the simple case of imitation in a contrapuntal piece. The relation of imitation instantiated by the piece is one that has application outside of the musical domain, and a work of music may exemplify this relation by prominently showcasing it.

b. Profundity

A related debate concerns the sense in which pure music may be described as profound. We routinely say of novels, poems, movies, and even paintings, that they are profound, and mean by this that they convey some sort of insight or give us food for thought. It has been a matter of debate whether pure music could do the same. If it does, then this would arguably constitute a further way in which music may be valuable.

Kivy is sceptical (Music Alone ch. 10). Pure music lacks the minimum requirements for profundity, namely the capacity of denoting something extra-musical, as well as the capacity to communicate profound propositions about that thing.

Kivy had excluded the possibility of musical profundity before others claimed against him that music may indeed be profound. He later likened his early expression of scepticism to the story of the man who told the children not to stick beans up their nose: they would have never had the idea had he not suggested it to them (“Another Go” 410). Much like the children in the story, philosophers of art have tried to do exactly what Kivy said wasn’t advisable, that is, show that music may be profound. What follows is a presentation of two relevant attempts.

Stephen Davies develops a notion of musical profundity that does not commit him to claims about the music’s possession of propositional content. He suggests an analogy with a game of chess (“Profundity” 348). A cleverly played game or an unexpected and brilliant move may be described as profound because of its capacity to illustrate the impressive potential of human ingenuity and inventiveness. A game of chess is profound not by communicating profound propositions but rather by showing profound analytical skills, problem-solving abilities, and so on. Similarly, music is sometimes profound because it displays a composer’s cleverness in handling the musical material, from the tonal development to the details of the orchestration.

Kivy remains unconvinced by this attempt, and notes that Davies’ criterion for profundity does not seem to reflect the intuitive claim that not all great works of art are profound. If profundity is a display of astonishing ingenuity, then all music masterpieces should be described as profound. That we would refuse to do so counts against Davies’ view of profundity (Kivy, “Another Go” 407).

According to Dodd, Kivy is right in holding music incapable of communicating propositional content, but he is mistaken in considering this a requirement for profundity. In fact, both requirements he sets out for something to be profound are misleading (Dodd, “The Possibility” 301). First, profundity does not require denotation but mere reference, and reference may be achieved in ways other than through denotation. Dodd’s suggestion is that display may be the relevant relation in the musical case. Among the properties it has, a work of music displays those that the sensitive listeners perceive as crucial to the work’s point. In doing so, it may elicit in such a listener a deeper understanding of its subject matter, that is, the displayed properties. According to Dodd, Kivy is misled from the start by his insistence on a quasi-semantic characterisation of profundity, a tendency he undoubtedly owes to his choice to treat literary profundity as paradigmatic (Dodd, “The Possibility” 302).

6. References and Further Reading

  • Bartel, Christopher. “Rock as a Three‐Value Tradition.” The Journal of Aesthetics and Art Criticism, vol. 75, no. 2, 2017, pp. 143–154.
  • Brown, Lee B. “Do Higher-order Music Ontologies Rest on a Mistake?” The British Journal of Aesthetics, vol. 51, no. 2, 2011, pp. 169–184.
  • Budd, Malcolm. Values of Art: Painting, Poetry, and Music. Penguin, 1995.
    • This work, while not uniquely concerned with music, has insightful discussion on both the musical expression of emotions and the value of music as art.
  • Davies, Stephen. “Artistic Expression and the Hard Case of Pure Music.” Contemporary Debates in Aesthetics and the Philosophy of Art, edited by Matthew Kieran, Blackwell, 2006, pp. 179–191.
  • Davies, Stephen. “Emotional Contagion from Music to Listener.” In his Musical Understandings & Other Essays on the Philosophy of Music, Oxford University Press, 2011, pp. 47–65.
    • Davies rejects Robinson’s view of emotional contagion and offers an alternative model.
  • Davies, Stephen. “The Expression of Emotion in Music.” Mind, vol. 89, no. 353, 1980, pp. 67–86.
  • Davies, Stephen. “John Cage’s 4′ 33″: Is It Music?” Australasian Journal of Philosophy, vol. 75, no. 4, 1997, pp. 448–462.
  • Davies, Stephen. “Music and Metaphor.” In his Musical Understandings & Other Essays on the Philosophy of Music, Oxford University Press, 2011, pp. 21–33.
  • Davies, Stephen. Musical Meaning and Expression. Cornell University Press, 1994.
    • A reference work for the debate on musical expressiveness. Like Kivy, Davies defends a resemblance theory of musical expressiveness, although their views do not overlap completely.
  • Davies, Stephen. “Musical Understandings.” In his Musical Understandings & Other Essays on the Philosophy of Music, Oxford University Press, 2011, pp. 88–128.
    • An overview of issues concerning musical understanding.
  • Davies, Stephen. Musical Works and Performances: A Philosophical Exploration. Clarendon Press, 2001.
    • An important work on musical ontology, with a focus on comparative ontology and authenticity.
  • Davies, Stephen. “Profundity in Instrumental Music.” The British Journal of Aesthetics, vol. 42, no. 4, 2002, pp. 343–356.
  • Dodd, Julian. “Musical Works: Ontology and Meta-ontology.” Philosophy Compass, vol. 3, no. 6, 2008, pp. 1113–1134.
  • Dodd, Julian. “Performing Works of Music Authentically.” European Journal of Philosophy, vol. 23, no. 3, 2015, pp. 485–508.
  • Dodd, Julian. “The Possibility of Profound Music.” The British Journal of Aesthetics, vol. 54, no. 3, 2014, pp. 299–322.
  • Dodd, Julian. “Upholding Standards: A Realist Ontology of Standard Form Jazz.” The Journal of Aesthetics and Art Criticism, vol. 72, no. 3, 2014, pp. 277–290.
  • Dodd, Julian. “What 4′ 33″ Is.” Australasian Journal of Philosophy, 2017, doi: 10.1080/00048402.2017.1408664.
  • Dodd, Julian. Works of Music: An Essay in Ontology. Oxford University Press, 2007.
    • Dodd presents a defence of the Platonist view of musical ontology.
  • Dyck, John. “Natural Sounds and Musical Sounds: A Dual Distinction.” The Journal of Aesthetics and Art Criticism, vol. 74, no. 3, 2016, pp. 291–302.
  • Fisher, John Andrew. “The Value of Natural Sounds.” Journal of Aesthetic Education, vol. 33, no. 3, 1999, pp. 26–42.
  • Fisher, John Andrew. “What the Hills Are Alive With: In Defense of the Sounds of Nature.” The Journal of Aesthetics and Art Criticism, vol. 56, no. 2, 1998, pp. 167–179.
  • Goehr, Lydia. The Imaginary Museum of Musical Works: An Essay in the Philosophy of Music. Clarendon Press, 1992.
    • A take on the analytic perspective on musical ontology that is well known even outside philosophical circles.
  • Goldman, Alan. “The Value of Music.” The Journal of Aesthetics and Art Criticism, vol. 50, no. 1, 1992, pp. 35–44.
  • Goodman, Nelson. Languages of Art: An Approach to a Theory of Symbols. Bobbs-Merrill, 1968.
    • From an historical perspective, this work is fundamental in setting the stage for future debates concerning musical ontology and expressiveness.
  • Gracyk, Theodore. Rhythm and Noise: An Aesthetics of Rock. Duke University Press, 1996.
    • A seminal work in the discussion of comparative ontology, specifically regarding the ontological status of rock works.
  • Kania, Andrew. “All Play and No Work: An Ontology of Jazz.” The Journal of Aesthetics and Art Criticism, vol. 69, no. 4, 2011, pp. 391–403.
  • Kania, Andrew. “Definition.” The Routledge Companion to Philosophy and Music, edited by Theodore Gracyk and Andrew Kania, Routledge, 2011, pp. 3–13.
    • Regarding the definition of music, this chapter in The Routledge Companion to Philosophy and Music is an excellent starting point. It is also worth noting that the entire Companion offers an excellent and up-to-date overview of most topics in the analytic philosophy of music.
  • Kingsbury, Justine. “Matravers on Musical Expressiveness.” The British Journal of Aesthetics, vol. 42, no. 1, 2002, pp. 13–19.
  • Kivy, Peter. “Another Go at Musical Profundity: Stephen Davies and the Game of Chess.” The British Journal of Aesthetics, vol. 43, no. 4, 2003, pp. 401–411.
  • Kivy, Peter. Authenticities: Philosophical Reflections on Musical Performance. Cornell University Press, 1995.
  • Kivy, Peter. The Corded Shell: Reflections on Musical Expression. Princeton University Press, 1980.
    • A reference work for the debate on musical expressiveness, defending a resemblance theory of musical expressiveness similar to Davies’, although their views do not overlap completely.
  • Kivy, Peter. Music Alone: Philosophical Reflections on the Purely Musical Experience. Cornell University Press, 1991.
    • Kivy discusses a variety of issues concerning musical value and profundity and defends a formalist view of the appreciation of Western classical music.
  • Kivy, Peter. “Music in Memory and Music in the Moment.” In his New Essays on Musical Understanding, Oxford University Press, 2001, pp. 183–217.
  • Kivy, Peter. “Platonism in Music: A Kind of Defense.” Grazer Philosophische Studien, vol. 19, 1983, pp. 109–129.
  • Levinson, Jerrold. “Concatenationism, Architectonicism, and the Appreciation of Music.” In his Musical Concerns: Essays in Philosophy of Music, Oxford University Press, 2015, pp. 32–44.
  • Levinson, Jerrold. “The Concept of Music.” In his Music, Art, and Metaphysics: Essays in Philosophical Aesthetics, Cornell University Press, 1990, pp. 267–278.
  • Levinson, Jerrold. “Musical Expressiveness as Hearability-as-expression.” Contemporary Debates in Aesthetics and the Philosophy of Art, edited by Matthew Kieran, Blackwell, 2006, pp. 192–204.
    • A clear formulation of the persona theory of musical expressiveness.
  • Levinson, Jerrold. “Music and Negative Emotions.” In his Music, Art, and Metaphysics: Essays in Philosophical Aesthetics, Cornell University Press, 1990, pp. 306–335.
  • Levinson, Jerrold. Music in the Moment. Cornell University Press, 1997.
    • Levinson offers the first formulation and defence of the concatenationist view.
  • Levinson, Jerrold. “What a Musical Work Is.” In his Music, Art, and Metaphysics: Essays in Philosophical Aesthetics, Cornell University Press, 1990, pp. 63–88.
    • An important work on musical ontology.
  • Matravers, Derek. Art and Emotion. Oxford University Press, 1998.
  • Nussbaum, Charles O. The Musical Representation: Meaning, Ontology, and Emotion. MIT Press, 2007.
  • Robinson, Jenefer. Deeper than Reason: Emotion and its Role in Literature, Music, and Art. Oxford University Press, 2005.
    • An ambitious and empirically informed study on emotional expression in the arts. The section on music defends a hybrid view that combines arousalist elements with Levinson’s persona theory. This work also presents Robinson’s model of emotional contagion from music to listener.
  • Scruton, Roger. The Aesthetics of Music. Oxford University Press, 1997.
    • A highly original and influential take on many of the issues discussed in this article, from definitional concerns to problems of expressiveness and value.
  • Taruskin, Richard. Text and Act: Essays on Music and Performance. Oxford University Press, 1995.
  • Tormey, Alan. The Concept of Expression: A Study in Philosophical Psychology and Aesthetics. Princeton University Press, 1971.
  • Trivedi, Saam. Imagination, Music, and the Emotions: A Philosophical Study. State University of New York Press, 2017.
  • Young, James O. “The Concept of Authentic Performance.” The British Journal of Aesthetics, vol. 28, no. 3, 1988, pp. 228–238.

 

Author Information

Matteo Ravasio
Email: mrav740@aucklanduni.ac.nz
University of Auckland
New Zealand

Bertrand Russell: Logic

For Russell, Aristotelian syllogistic inference does not do justice to the subject of logic. This is surely not surprising. It may well be something of a surprise, however, to learn that in Russell’s view neither Boolean algebra nor modern quantification theory does justice to the subject. For Russell, logic is a synthetic a priori science studying all the kinds of structures there are. This thesis about logic makes up the lion’s share of Russell’s philosophy of logic until the late 1920s, and we shall have little to say of his flirtations with the naturalization of mind thereafter. We shall have much to say about his views on the ontology of structures, for they underwent extensive changes in the time from his writing The Principles of Mathematics (1903) to the three published volumes, of four projected, of Principia Mathematica (1910, 1912, 1913), coauthored with Alfred North Whitehead. The fourth volume, on geometry, never appeared.

Much of this article’s presentation of Russell’s logic will concern Russell’s various logical systems as they pertain to his Logicism. In “Mathematics and the Metaphysicians” (1901), Russell heralds his logicist thesis, observing that mathematics has enjoyed a conceptual revolution. One of the chief triumphs of modern mathematics, he explains, consists in having discovered that mathematics studies relational structure and is therefore free of commitment to the metaphysicians’ abstract particulars such as numbers and spatial figures. This revolutionary conception of mathematics was made possible by advances in geometry, especially in non-Euclidean geometry, and advances in analysis, where real numbers, limits and continuity were newly defined by thinkers such as Cantor, Dedekind, and Weierstrass. On the new conception of mathematics, it is relational order, not magnitude, that is the focus.

Meanwhile, Logic was also enjoying a conceptual revolution due to Gottlob Frege, who maintained that with the impredicative comprehension of functions, logic (that is, comprehension principle logic, ‘cp-Logic’ hereafter) is an informative science. Russell took this new science to be a study of relational structure, conducted by studying relations independently of whether they are exemplified. The branches of mathematics, in Russell’s view, are studies of different sorts of relations, which structure their fields. Mathematics, then, is a branch of the cp-Logic of relations.

Table of Contents

  1. Russell’s Logicism
  2. The Simple Type Syntax of PrincipiaL
  3. Developments: Principia Mathematica’s Section 8, Definite Descriptions, Class Expressions
  4. The Ramified-Type Syntax of Church’s PrincipiaC
  5. The Quantification Theory of Propositions in Theory of Implication (c. 1905)
  6. The Substitutional Theory of Propositions (1905)
  7. The Substitutional Theory Without General Propositions (1906)
  8. Church’s PrincipiaC and Russell’s Orders of Propositions (c. 1907)
  9. Appendix A: Quantification Theory in The Principles of Mathematics
  10. Appendix B: The 1925 experiment of Principia MathematicaW
  11. References and Further Reading
    1. Works by Russell
    2. Books and Articles

1. Russell’s Logicism

In The Art of Philosophizing, Bertrand Russell offered the following admonishment:

If you wish to become a logician, there is one piece of advice which I cannot urge too strongly, and that is: Do Not learn the traditional formal logic. In Aristotle’s day it was a creditable effort, but so was the Ptolemaic astronomy. To teach either in the present day is a ridiculous piece of antiquarianism.

Russell’s own logics are tailored to his Logicism. Care must be taken in using the word ‘Logicism,’ however, since its advocates have had quite different agendas and quite different conceptions of what it entails. Carnap’s characterization presents Logicism as wedded to a deductive thesis according to which all the truths of mathematics can be derived as theorems from a consistent axiomatic foundation that captures all and only logical truths. This use of ‘Logicism’ can lead to confusion. This form of Logicism belongs neither to Frege’s nor to Russell’s conception. Though both held that a system of cp-Logic is consistently recursively axiomatizable, neither made it definitive of Logicism. Gödel showed that a consistent axiomatic calculus adequate to represent every recursive natural number theoretic function is negation incomplete. That is, for each such calculus there is a wff (well-formed formula) G such that neither G nor ~G is a theorem. Since either G or ~G is true in the standard model, the consistent axiomatic system must leave out a truth of arithmetic. But this is also irrelevant to Logicism as Frege and Russell understood it.
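The incompleteness result just described can be put schematically as follows. This is only a sketch in present-day notation, not anything drawn from Frege or Russell; the letter T (for the calculus), the turnstile for theoremhood, and the symbol for the standard model of arithmetic are labels introduced here for illustration.

```latex
% Requires amssymb for \nvdash and \mathbb.
% T (assumed label): a consistent, recursively axiomatized calculus adequate
% to represent every recursive function on the natural numbers.
% G: the wff whose existence the result asserts.
\[
  T \nvdash G \qquad \text{and} \qquad T \nvdash {\sim} G
\]
% Exactly one of G and ~G is true in the standard model N, so there is
% a wff A that is true in N and yet is not a theorem of T:
\[
  A \in \{\, G,\ {\sim} G \,\}, \qquad \mathbb{N} \models A, \qquad T \nvdash A
\]
```

Read this way, the result concerns what a single axiomatic calculus can derive, which is why it bears on the deductive thesis rather than on the claim that mathematics is a branch of cp-Logic.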

Let us put forth the following definition that altogether separates the deductive thesis from the Logicist thesis. Russell’s Logicism is expressed by this definition:

RLogicism =df pure mathematics is a branch of cp-Logic.

Russell’s Logicism is the thesis that all branches of mathematics, including geometry, Euclidean or otherwise, are studies of relational structures and therefore are studies that can be subsumed within the cp-Logic of relations. Cp-Logic is not modern quantification theory with identity. It is a quantification theory that enables the binding of predicate variables as well as individual variables and which embraces the impredicative comprehension of relations independently of whether these relations are exemplified. Its impredicativity indicates that no restrictions are to be placed on the quantifiers occurring in the wffs which give the exemplification conditions for comprehension of universals.
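In present-day second-order notation, the impredicative comprehension of an n-place relation is commonly written as the schema below. It is offered as a hedged illustration only: the symbols R and φ are placeholders introduced here, and the notation is not that of any of Russell's own systems.

```latex
% Comprehension schema for n-place relations (modern second-order notation).
% Side condition: the predicate variable R does not occur free in \varphi.
\[
  \exists R \,\forall x_1 \dots \forall x_n
  \bigl( R(x_1, \dots, x_n) \leftrightarrow \varphi \bigr)
\]
% Impredicativity: \varphi may itself contain bound predicate variables,
% for example \varphi := \forall F \,( F(a) \rightarrow F(x_1) ).
```

The schema asserts that there is a relation with the stated exemplification conditions whether or not anything satisfies φ, which matches the point that comprehension is independent of exemplification.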

Two important revolutions, one due to Cantor and another due to Frege, are behind Russell’s Logicism, and it would be inconceivable without them. Henri Poincaré, a prominent mathematician, never could embrace the revolutions. Poincaré thought of logic and mathematics in the old ways, with mathematics about metaphysical abstract particulars, numbers and spatial figures, and logic constrained to proper inference in reason, that is, a theory of a deductive consequence relation. Poincaré thought Russell’s Logicism entailed that mathematicians are to change their creative practices and tailor proof techniques into the p’s and q’s of a canonical logistic. Russell’s Logicism entails no such transformation. It simply maintains that, as a study of relational structures, mathematics is a part of cp-Logic as the synthetic a priori science studying all the kinds of relational structures there are by studying the way relations, exemplified or not, order their fields. This is not a movement coming from outside of mathematics. It comes from within. It implies that mathematicians are doing cp-Logic (that is, studying relational structures) when they do mathematics.

In Russell’s view, Cantor’s revolution, together with such figures as Weierstrass, Dedekind and Pieri, was responsible for inaugurating the transformation of all branches of mathematics into studies of kinds of relational structures. Russell’s agenda was to demonstrate that abstract particulars are nowhere needed in any branch of mathematics. Frege’s revolution was no less central to Russell’s unique Logicism. It was responsible for transforming the field of logic into cp-Logic, which, as Frege saw it, embraces the informative impredicative comprehension of functions. It was precisely this impredicative comprehension that enabled his new cp-Logic to be an informative science capable of capturing the notions of the ancestral and cardinal number, and to arrive at a theorem of mathematical induction. Frege had seen this already in his Begriffsschrift (1879). Russell came to appreciate it slowly. Frege never quite embraced what Russell regarded as the Cantorian revolution and certainly did not have the Russellian agenda of eliminating abstract particulars, not from geometry and certainly not from the arithmetic of numbers (cardinal, natural, and so on). Quite to the contrary, Frege was adamant in maintaining that cardinal numbers are objects.
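The role of impredicative comprehension in capturing the ancestral can be seen in the usual modern rendering of Frege's definition. The formula below is a sketch in linear second-order notation rather than Frege's own two-dimensional script, and the letters R, F, a, and b are placeholders chosen here.

```latex
% b follows a in the R-series: b has every property F that
% (i) belongs to all R-successors of a, and
% (ii) is hereditary in the R-series.
\[
  R^{*}(a,b) \;=_{df}\;
  \forall F \Bigl(
    \bigl( \forall x \,( R(a,x) \rightarrow F(x) )
    \;\wedge\;
    \forall x \,\forall y \,( F(x) \wedge R(x,y) \rightarrow F(y) ) \bigr)
    \rightarrow F(b)
  \Bigr)
\]
```

The quantifier over F ranges over all properties, including properties that are themselves specified by means of the ancestral. That is the impredicativity at issue, and it is what allows an induction principle to be derived as a theorem rather than assumed.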

In The Principles of Mathematics (1903) Russell’s aim is to explain his Logicism.

The Principles of Mathematics operates with an ontology of logically necessary abstract particulars that are called ‘propositions’. They are mind and language independent entities some of which have the unanalyzable property of being true while others are false. The work was to have a second volume which worked out in a technically formal symbolic way the doctrines of the first volume. The second volume would also solve paradoxes such as Cantor’s paradox of the greatest cardinal, the Burali-Forti paradox of the greatest ordinal, and Russell’s paradoxes of classes and attributes (The Principles of Mathematics, p. xvi). The second volume was to have been coauthored with Alfred North Whitehead, who had been a long-time mentor of Russell in mathematics and whose work on abstract algebra is a natural ally of the logicist agenda. But the project was abandoned.

Instead, Whitehead and Russell produced Principia Mathematica. The Preface goes so far as to say that the work of Principia Mathematica had begun in 1900, even prior to the publication of The Principles of Mathematics. It explains that instead of a second volume for The Principles of Mathematics couched in an ontology of logically necessary existing propositions, the work offers a fresh start, avoiding abstract particulars not only in all the branches of mathematics but also in the field of cp-Logic itself (Principia Mathematica, p. v). Ultimately, Russell went on to endeavor to eliminate abstract particulars from philosophy altogether. This is the agenda of his book Our Knowledge of the External World as a Field for Scientific Method in Philosophy (1914), which offered a research program that made Principia Mathematica’s cp-Logic the essence of philosophy. The program, Russell thought, held promise for solving all philosophical problems, problems arising from the paucity of imagination among speculative metaphysicians that results in an inadequate logic that produces indispensability arguments for abstract particulars and kinds of non-logical necessity governing them.

Though Russell’s transition from The Principles of Mathematics to Principia Mathematica is quite complicated, the logicist thesis of the former has not changed at all in the latter. Ample evidence can be found in Principia Mathematica in the following:

Section A: The theory of Deduction (p. 90).

Summary of Principia’s Part I (p. 87).

Summary of Part II, Section A: Prolegomenon to Cardinal Arithmetic (p. 329).

Principia Mathematica says, for example, that the subject of cardinal arithmetic is regarded as different only in degree from the subject matter of logic discussed in Part I. Principia Mathematica is surely advocating Logicism just as in The Principles of Mathematics, but some quite striking changes occur between the two works. For example, in Principia Mathematica Whitehead and Russell no longer regard the infinity of natural numbers to be a subject for mathematics to decide. This result so surprised Boolos (1994) that he concluded that the work no longer advances Logicism. But quite to the contrary, it stems from the same source as the discovery in non-Euclidean geometry that not all right triangles obey the Pythagorean theorem. The agenda is to reject indispensability arguments for abstract particulars; the results follow from there. Similarly, that the infinity of the natural numbers is not a mathematical issue follows from the rejection of classes or sets as abstract particulars. There are many such surprises in Principia Mathematica. Another is the discovery that Hume’s Principle, which asserts that the cardinals of two classes are the same if and only if the classes are similar, admits of exceptions (see Landini 2016). Though the conception of Logicism has not changed, it is easy to see that quite a lot happened in the interim between The Principles of Mathematics and Principia Mathematica.

For a great many years the interim period has been akin to the dark ages whose role in modern science has only recently come to light. In this period, Russell worked steadfastly to emulate the impredicative comprehension of cp-Logic in an ingenious substitutional logic of propositional structure. The foundation of the idea to find a substitutional theory to emulate a simple type theory of universals (and thereby classes) is already manifest in Appendix B of The Principles of Mathematics itself. But it used the substitution of denoting concepts (‘all a’, ‘some a’, ‘the a’, ‘an a’, ‘any a’ and ‘every a’). The theory of denoting concepts of The Principles of Mathematics proved to be a quagmire, and without the 1905 theory of definite descriptions, Russell could not execute the plan for a substitutional theory (see Landini 1998b). The substitutional theory finally became viable in 1905 and it pervaded Russell’s work until 1908, but most of it was almost completely unknown until the 1980s. Happily, much of Russell’s work during this time has become clear, such that we can better understand the evolution of Principia Mathematica and Russell’s apparently sudden abandonment of propositions. Contrary to years of misunderstanding, the evolution of Russell’s mathematical logic toward Principia Mathematica was not driven by a misguided interest in finding a common solution of both logical and semantic paradoxes. What ended Russell’s substitutional theory of propositional structure was not problems of unity, not problems concerning Liar paradoxes of propositions, and certainly not semantic paradoxes of naming or denoting or defining characteristic of the Richard, Berry, or Grelling paradoxes. What ended the substitutional theory was a paradox, here called Russell’s ‘p₀/a₀ paradox’. Unlike Liar and semantic paradoxes, it is a Cantorian diagonal paradox grounded in the fact that the emulation of simple types of attributes in the substitutional theory is inconsistent with Cantor’s power-theorem, which assures that there can be no function from objects (propositions being themselves objects) onto properties of those objects.
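The diagonal reasoning behind Cantor’s power-theorem can be illustrated in a small finite setting. The following Python sketch is not a rendering of Russell’s substitutional theory. It assumes, purely for illustration, that properties are modeled extensionally as subsets of a finite domain of objects, and all function names are hypothetical. For every map f from objects to subsets, it forms the diagonal set D of objects that do not belong to their own image and checks that D is never among the values of f, so f cannot be onto.

```python
from itertools import product


def all_subsets(domain):
    """List every subset of a finite domain as a frozenset."""
    items = sorted(domain)
    return [
        frozenset(x for x, keep in zip(items, mask) if keep)
        for mask in product([False, True], repeat=len(items))
    ]


def diagonal_set(domain, f):
    """Return D = {x in domain : x is not a member of f(x)}."""
    return frozenset(x for x in domain if x not in f[x])


def every_map_misses_its_diagonal(domain):
    """Check by brute force that no map from domain to its subsets hits D."""
    items = sorted(domain)
    subsets = all_subsets(domain)
    # Enumerate every possible assignment of a subset to each object.
    for images in product(subsets, repeat=len(items)):
        f = dict(zip(items, images))
        if diagonal_set(domain, f) in f.values():
            return False
    return True


# Illustrative three-object domain (an assumption, not from the source).
objects = {"a", "b", "c"}
print(every_map_misses_its_diagonal(objects))  # True
```

The brute-force check only confirms the finite case; the theorem itself is established by the general diagonal argument, which does not depend on the size of the domain.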

In summary, the whole of Russell’s philosophical work in mathematical logic may be seen in terms of his trials and tribulations at emulating an impredicative simple-type regimented cp-Logic of universals. Our focus, therefore, is squarely on the evolution of the cp-Logic of Principia Mathematica. In what follows, we shall outline the major logical systems that led Whitehead and Russell to Principia Mathematica’s syntax and formal theory and the informal semantic interpretation they gave it. Since Russell’s work toward a substitutional theory in The Principles of Mathematics ended in a quagmire and did not yield a formal system, we shall not pause to discuss it. The basic quantification theory of The Principles of Mathematics was replaced by the 1905 “Theory of Implication,” which formed the quantification theory for the logic of substitution that was to appear in Whitehead and Russell’s second volume of The Principles of Mathematics.

When Russell abandoned the propositions of his substitutional theory, he abandoned the idea of a second volume for The Principles of Mathematics. But he did not abandon hope that an emulation of an impredicative simple-type stratified regimentation of the cp-Logic of universals might still be found. In the introduction to the first edition of Principia Mathematica, Whitehead and Russell propose an informal nominalistic semantic interpretation of the object-language bindable predicate variables. But by 1920, Russell had come to realize that such a nominalistic semantics could not validate impredicative comprehension axioms. Only a Realist semantics can validate the comprehension principles of Principia Mathematica’s impredicative simple-type regimented cp-Logic. Russell never stopped trying, however. In the 1925 second edition, Russell experimented with Wittgensteinian ideas for emulating impredicative comprehension, imagining an altered grammar to accommodate extensionality. Whitehead was not happy with this experiment being included in the new edition since neither he nor Russell intended to advocate it. Alas, Whitehead was right (see, for example, Lowe 1990, Monk 1996). The ideas of the 1925 second edition are sketched in an appendix and end our discussion of Russell’s logics.

2. The Simple Type Syntax of PrincipiaL

pdf

3. Developments: Principia Mathematica’s Section 8, Definite Descriptions, Class Expressions

pdf

4. The Ramified-Type Syntax of Church’s PrincipiaC

pdf

5. The Quantification Theory of Propositions in Theory of Implication (c. 1905)

pdf

6. The Substitutional Theory of Propositions (1905)

pdf

7. The Substitutional Theory Without General Propositions (1906)

pdf

8. Church’s PrincipiaC and Russell’s Orders of Propositions (c. 1907)

pdf

9. Appendix A: Quantification Theory in The Principles of Mathematics

pdf

10. Appendix B: The 1925 experiment of Principia MathematicaW

pdf

11. References and Further Reading

a. Works by Russell

  • The Collected Papers of Bertrand Russell, Vol. 4, Foundations of Logic: 1903-1905, ed. by Alasdair Urquhart (London: Routledge, 1994).
  • The Collected Papers of Bertrand Russell, Vol. 6, Logic and Philosophy Papers: 1901-1913, ed. John G. Slater (London: Routledge, 1992).
  • The Principles of Mathematics, (PoM) second-edition (New York: W.W. Norton & Co., second edition 1937, 1964). First edition (London: Allen & Unwin, 1903).
  • “On Denoting,” in Essays in Analysis, pp. 103-119. First published in Mind 14 (1905), pp. 479-493.
  • “On Fundamentals,” Collected Papers Vol. 4, pp. 359-413.
  • “On The Logic of Relations,” in Logic and Knowledge Essays, pp. 3-38. First published as “Sur la logique des relations,” Rivista di Matematica, Vol. vii, (1901), pp. 115-148.
  • “On the Relation of Mathematics to Logic,” in Essays in Analysis, pp. 260-271. First published as “Sur la Relation des Mathématiques à la Logistique,” in Revue de Métaphysique et de Morale 13, (1905) pp. 906-917.
  • “On Some Difficulties in the Theory of Transfinite Numbers and Order Types,” in Essays in Analysis, pp. 135-164. First published in Proceedings of the London Mathematical Society 4 (March 1906), pp. 29-53.
  • “On the Substitutional Theory of Classes and Relations,” in Essays in Analysis, pp. 165-189. Manuscript received by the London Mathematical Society on 24 April 1905.
  • “On ‘Insolubilia’ and Their Solution By Symbolic Logic,” in Essays in Analysis, pp. 190-214. First published as “Les Paradoxes de la Logique,” Revue de Métaphysique et de Morale, 14 (1906) pp. 627-50.
  • “Mathematical Logic as Based on the Theory of Types,” in Logic and Knowledge, pp. 59-102. First published in The American Journal of Mathematics 30 (1908), pp. 222-62.
  • Philosophy (New York: W. W. Norton & Co., 1927).
  • Principia Mathematica (coauthored by A. N. Whitehead), second edition (Cambridge, 1925, 1962); First edition, Cambridge, Vol. 1 (1910), Vol. 2 (1911), Vol. 3 (1913).
  • Principia Mathematica to *56 (Cambridge, 1964).
  • Introduction to Mathematical Philosophy (London: Allen & Unwin, 1919, 1953).
  • My Philosophical Development (New York: Simon & Schuster, 1959).
  • The Art of Philosophizing (New York: Philosophical Library, 1968).

b. Books and Articles

  • Blackwell, Kenneth. “The Early Wittgenstein and the Middle Russell,” in Irving Block ed., Perspectives on the Philosophy of Wittgenstein (Cambridge, MIT Press, 1981), p. 27, fn. 3.
  • Boolos, George. 1994 “The Advantages of Honest Toil over Theft,” in Alexander George, ed., Mathematics and Mind (Oxford: Oxford University Press), pp. 27-44.
  • Church, Alonzo. (1956). Introduction to Mathematical Logic (New Jersey: Princeton University Press).
  • Church, Alonzo. 1976 “Comparison of Russell’s Resolution of the Semantical Antinomies with that of Tarski,” Journal of Symbolic Logic 41, pp. 747-760.
  • Church, Alonzo. (1984) “Russell’s Theory of the Identity of Propositions,” Philosophia Naturalis 21, pp. 513-22.
  • Cocchiarella, Nino. 1987 “The Development of the Theory of Logical Types and the Notion of a Logical Subject in Russell’s Early Philosophy,” Synthese 45 (1980), pp. 71-115. Reprinted in Logical Studies in Early Analytic Philosophy (Columbus: Ohio State University Press), pp. 19-63.
  • Cocchiarella, Nino. 1987 “Logical Atomism and Modal Logic,” in Logical Studies in Early Analytic Philosophy (Columbus: Ohio State University Press), pp. 222-243.
  • Cocchiarella, Nino. 1987 “Logical Atomism, Nominalism and Modal Logic,” Philosophia, Philosophical Quarterly of Israel 4, (1974), pp. 41-44. Reprinted in Logical Studies in Early Analytic Philosophy (Columbus: Ohio State University Press), pp. 244-284.
  • Cocchiarella, Nino. 1987 “Russell’s Theory of Logical Types and the Atomistic Hierarchy of Sentences,” in Nino Cocchiarella, Logical Studies in Early Analytic Philosophy (Columbus: Ohio State University Press), pp. 193-221.
  • Copi, Irving. 1971 The Theory of Logical Types (London: Routledge & Kegan Paul).
  • Frege, Gottlob. 1884 The Foundations of Arithmetic, translated by J. L. Austin (Northwestern University Press, 1980). First published as Die Grundlagen der Arithmetik: eine Logisch-Mathematische Untersuchung über den Begriff der Zahl, (Breslau, 1884).
  • Frege, Gottlob. 1893 Grundgesetze der Arithmetik, Vol. I (Jena, 1893), Vol. II (Jena, 1903). Reprinted (Darmstadt Hildesheim: Georg Olms Verlag, 1962).
  • Frege, Gottlob. 1892 “On Concept and Object,” in Peter Geach and Max Black, eds., Translations from the Philosophical Writings of Gottlob Frege, (Oxford: Basil Blackwell, 1977), pp. 21-41. First published as “Über Begriff und Gegenstand,” in Vierteljahrsschrift für wissenschaftliche Philosophie, vol. XIV 1892, pp. 192-205.
  • Frege, Gottlob. 1980 Philosophical and Mathematical Correspondence, edited by Gottfried Gabriel, Hans Hermes, Friedrich Kambartel, Christian Thiel, Albert Veraart, and abridged from the German edition by Brian McGuinness and translated by Hans Kaal (Chicago: University of Chicago Press).
  • Frege, Gottlob. 1964 The Basic Laws of Arithmetic: Exposition of the System, translated with an editor’s introduction by Montgomery Furth (Berkeley: University of California Press).
  • Galaugher, Jolen. 2013 “Substitution’s Unsolved Insolubilia,” Russell 3, pp. 5-30.
  • Geach, P. T. 1956 “Frege’s Way Out” Mind 65, pp. 408-409.
  • Gödel, Kurt. 1944 “Russell’s Mathematical Logic,” in ed., Paul Arthur Schilpp, The Philosophy of Bertrand Russell (Evanston: Northwestern University Press), 125-153.
  • Grattan-Guinness, Ivor. 1977 Dear Russell- Dear Jourdain (London: Duckworth).
  • Grattan-Guinness, Ivor. 2001 The Search for Mathematical Roots 1870-1940: Logic, Set Theories and the Foundations of Mathematics from Cantor Through Russell to Gödel (Princeton University Press).
  • Griffin, Nicholas. 1981 “Russell on the Nature of Logic” Synthese 45, pp. 117-188.
  • Griffin, Nicholas. ed., 2003 The Cambridge Companion to Bertrand Russell (Cambridge University Press).
  • Hatcher, William. 1982 The Logical Foundations of Mathematics (Oxford: Pergamon Press).
  • Hazen, Allen. 2004 “A ‘Constructive’ Proper Extension of Ramified Type Theory; The Logic of Principia Mathematica, Second Edition, Appendix B,” in ed., Godehard Link, One Hundred Years of Russell’s Paradox (Berlin: Walter de Gruyter), pp. 449-480.
  • Holroyd, Michael. 1967 Lytton Strachey (London: Heinemann).
  • Landini, Gregory. 1996 “The Definability of the Set of Natural Numbers in the 1925 Principia Mathematica,” Journal of Philosophical Logic 25, pp. 597-615.
  • Landini, Gregory. 1998a Russell’s Hidden Substitutional Theory (New York: Oxford University Press).
  • Landini, Gregory. 1998b “On Denoting Against Denoting,” Russell 18, pp. 43-80.
  • Landini, Gregory. 2000 “Quantification Theory in *9 of Principia Mathematica,” History and Philosophy of Logic 21, pp. 57-78.
  • Landini, Gregory. 2004a “Logicism’s ‘Insolubilia’ and Their Solution By Russell’s Substitutional Theory,” in ed., Godehard Link, One Hundred Years of Russell’s Paradox (New York: De Gruyter), 373-399.
  • Landini, Gregory. 2004b “Russell’s Separation of the Logical and Semantic Paradoxes,” in Philippe de Rouilhan, ed., Russell en héritage, Revue Internationale de Philosophie 3, pp. 257-294.
  • Landini, Gregory. 2005 “Quantification Theory in *8 of Principia Mathematica and the Empty Domain,” History and Philosophy of Logic, 25, pp. 47-59.
  • Landini, Gregory. 2007 Wittgenstein’s Apprenticeship With Russell (Cambridge: Cambridge University Press).
  • Landini, Gregory. (2013a) “Zermelo and Russell’s Paradox: Is there a Universal set?” Philosophia Mathematica, vol. 21, pp. 180-199.
  • Landini, Gregory. (2013b) “Review of Bernard Linsky, The Evolution of Principia Mathematica: Bertrand Russell’s Manuscripts and Notes for the Second Edition,” History and Philosophy of Logic 34, pp. 79-97.
  • Landini, Gregory. 2016 “Whitehead’s Badly Emended Principia,” History and Philosophy of Logic 37, pp. 1-56.
  • Linsky, Bernard. 1999 Russell’s Metaphysical Logic (Stanford: CSLI Publications).
  • Lowe, Victor. 1990 Alfred North Whitehead: The Man and His Work. Volume II: 1910-1947, edited by J. B. Schneewind (Baltimore: Johns Hopkins University Press).
  • Monk, Ray. Bertrand Russell: The Ghost of Madness 1921-1970 (The Free Press, 2001).
  • Myhill, John. 1974 “The Undefinability of the Set of Natural Numbers in the Ramified Principia,” in George Nakhnikian, ed., Bertrand Russell’s Philosophy (New York: Barnes & Noble), pp. 19-27.
  • Quine, W.V.O. 1954 “Quantification and the Empty Domain,” Journal of Symbolic Logic 19, pp. 177-179.
  • Quine, W.V.O. “Frege’s Way Out” Mind 64 (1955), pp. 145-159.
  • Quine, W.V.O. 1980 Set Theory and Its Logic (Cambridge: Harvard University Press).
  • Ramsey, Frank. 1925 “The Foundations of Mathematics,” in R. B. Braithwaite, ed., The Foundations of Mathematics and Other Essays by Frank Plumpton Ramsey (Harcourt, Brace and Co., 1931), pp. 1-61. First published in the Proceedings of the London Mathematical Society, 25 (1925), pp. 338-84.
  • Rouilhan (de), Philippe. 1996 Russell et le cercle des paradoxes (Paris: Presses Universitaires de France), p. 275.
  • Schmid, Anne-Françoise. 2001 ed., with commentary, Bertrand Russell: Correspondance sur la Philosophie, la Logique et la Politique avec Louis Couturat 1897-1913 (Paris: Éditions Kimé, volume I, II).
  • Van Heijenoort, Jean. 1967 “Logic as Calculus and Logic as Language,” Synthese 17, pp. 324-30.
  • Whitehead, A. N. 1911 An Introduction to Mathematics (London: Williams and Norgate).
  • Wittgenstein, Ludwig. 1914 Notebooks 1914-1916, ed. by G. H. von Wright and G. E. M. Anscombe, (Chicago: University of Chicago Press, 1979).
  • Mays, Wolfe. 1967 “Recollections of Wittgenstein,” in K. T. Fann (ed), Ludwig Wittgenstein: The Man and His Philosophy (New Jersey).

 

Author Information

Gregory Landini
Email: gregory-landini@uiowa.edu
University of Iowa
U. S. A.

Natural Kinds

A large part of our exploration of the world consists in categorizing or classifying the objects and processes we encounter, both in scientific and everyday contexts. There are various, perhaps innumerable, ways to sort objects into different kinds or categories, but it is commonly assumed that, among the countless possible types of classifications, one group is privileged. Philosophy refers to such categories as natural kinds. Standard examples of such kinds include fundamental physical particles, chemical elements, and biological species. The term natural does not imply that natural kinds ought to categorize only naturally occurring stuff or objects. Candidates for natural kinds can include man-made substances, such as synthetic elements, that can be created in a laboratory. The naturalness in question is not the naturalness of the entities being classified, but that of the groupings themselves. Groupings that are artificial or arbitrary are not natural; they are invented or imposed on nature.  Natural kinds, on the other hand, are not invented, and many assume that scientific investigations should discover them.

To say that a kind is natural, rather than artificial or arbitrary, means, minimally, that it reflects some relevant aspects of the world and not only the interests of, or facts about, the classifiers. The expression “footwear under $100,” for instance, describes an artificial kind reflecting some categorizer’s interest—their budget—and not some relevant feature of the classified objects.

Another feature of natural kinds is that they allow many important inferences about the entities grouped within them. Take gold: All entities classified as gold share a property, their atomic structure, that uniquely identifies a chemical element. This property also accounts for gold’s other observed properties, such as its color, malleability, and so forth. Identifying something as gold thus warrants many inferences and generalizations that apply to all samples of gold, for example, that it dissolves in mercury at room temperature and is unaffected by most acids.

More problematic, but still debated as possible instances of natural kinds, are categories in higher-level sciences: psychological categories, such as emotion; psychiatric conditions, such as depression; and social categories, such as money. We might not be able to identify anything like the atomic structure of a chemical element for depression. However, one might still wonder whether people suffering from it share properties that account for their behaviors and help us explain the condition’s causes and how it might be treated. Few people, perhaps, will consider most higher-level categories, such as psychiatric conditions, to be candidates for natural kinds. Nonetheless, what makes depression a legitimate scientific category, unlike hysteria, remains to be examined.

This article describes the three most prominent accounts of natural kinds: essentialism, cluster kinds, and promiscuous realism. It spells out some of the features standardly associated with natural kinds and then examines the three views on natural kinds via specific examples of candidates for natural kinds in chemistry, biology and psychiatry. The final section discusses the metaphysics of natural kinds and offers a systematization of the possible views.

Table of Contents

  1. What Makes a Kind Natural?
    1. Natural Kind Monism vs. Pluralism
    2. How to Identify Natural Kinds: Their Role in Inductive Generalizations, Scientific Laws, and Explanations
    3. Natural Kinds and Functional Kinds
    4. The Increase of Interest in Natural Kinds in the Twentieth Century
  2. Three Views on Natural Kinds: Essentialism, Cluster Kinds, and Promiscuous Realism
    1. Essentialism: The Case of Chemical Elements
    2. Natural Kinds as Property Clusters: The Case of Biological Species
    3. Promiscuous Realism: The Case of Psychiatric Categories
  3. Metaphysics of Natural Kinds
    1. What Does It Mean that a Kind is Real?
    2. The Relationship Between Scientific Realism and Natural Kinds Realism
    3. Natural Kinds Realism
    4. Natural Kinds Antirealism
  4. Conclusion
  5. References and Further Reading

1. What Makes a Kind Natural?

The philosophical tradition has long demanded that we ought to search for natural classifications in our investigation of the world. The nature of this demand can be difficult to spell out. This idea is often illustrated with Plato’s famous metaphor about “carving nature at its joints.” In Phaedrus, he says that we should “divide into forms, following the objective articulation; we are not to attempt to hack off parts like a clumsy butcher” (Plato 1952, 265e).  The underlying intuition here is that the natural world is divisible into objective categories and that we should strive to discover such divisions. That is, our exploration of the world should model itself on the practice of a competent butcher who, when cutting the meat, follows its natural divisions and does not clumsily hack parts off.

Questions arise as to how we identify suitable candidates for such “natural openings” and where we should draw divisions between objects in the world. One good place to look for them would be in the discipline of particle physics because it appears that, if there are some objective divisions in nature, they will surely be found at the level of fundamental entities that comprise all existing things: protons, neutrons, electrons, or even smaller particles like quarks. That kind of reasoning was already present in ancient Greece, where attempts were made at discovering the true nature of all things, whether it was elements that everything else is composed of, like water or fire, or whether it required finding the smallest indivisible building blocks of matter, like atoms. In this respect, contemporary scientific research might be seen as a continuation of the same project.

Alternatively, one might argue that the approach of finding the most basic constituents of matter is too restrictive and that there are many other objective categories to be discovered. In geology, for instance, different rocks can be divided according to their qualities—mineral and chemical composition, permeability, texture of the constituent particles, particle size—and these can be taken as objective parameters for classification. Moreover, some authors make a case that there are natural kinds in the higher-level or special sciences such as biology, psychology, or linguistics (Fodor 1974). It could be argued, for example, that certain basic emotions, such as fear and anger, are identified and recognized across different cultures, which makes them suitable candidates for natural kinds. Similar reasoning might be applied to nonnatural or artificial entities, including cultural artifacts, such as language. The fact that certain linguistic patterns occur systematically across all natural languages may indicate that groupings of such patterns represent objective linguistic categories.

a. Natural Kind Monism vs. Pluralism

Cross-cultural convergence in classification, as in the example above of common linguistic patterns, can be interpreted in two ways. One is to say that it indicates the existence of objective categories that rational investigators will eventually discover. The other is the notion that we group things in such a way because our cognitive makeup makes those groupings especially salient to us. In this case, the grouping would not only reflect the objective structure of the world, but also our cognitive dispositions. This issue is examined in the section entitled “Metaphysics of Natural Kinds.”

In many cases, however, the classification systems are not shared, but rather vary cross-culturally, or across different scientific disciplines. Facing such situations, one might wonder whether there is one correct system or whether different ones can be equally valid. Going back to Plato’s metaphor, if different butchering traditions produced meat that is carved up differently, so that there are no T-bone steaks in England and no roasts in the US, would it mean that one of those traditions is doing it wrong, or that there are different ways to carve the meat at its joints? On this issue, we can distinguish between the position of the monists and that of the pluralists.

Natural kind monists hold that there is only one correct way of dividing the world into natural kinds, of carving nature at its joints. In such a view, no crosscutting classifications should be considered natural kinds. In case there is any overlap between different kinds, one must be a sub-kind of the other. This claim is known as the hierarchy thesis regarding natural kinds (Khalidi 2013). The isotopes of hydrogen, for instance—protium, deuterium, and tritium—can be said to constitute a sub-kind of the kind hydrogen. That is, they have the same atomic number, but different numbers of neutrons in the nucleus. Accordingly, a monist either claims that there is one natural categorization of entities in the world, and it must apply only to the lowest possible level of classification, or, if there are higher-level natural kinds, they should form a hierarchy that bottoms out at the lowest level. From this we can see that monists do not necessarily need to endorse the hierarchy thesis.

Natural kind pluralists, on the other hand, countenance different ways of classifying entities into natural kinds. In their view, entities can be cross-classified in different ways, depending on the purposes that these classifications serve. We can classify biological organisms, for instance, into species if we are interested in their ancestry or breeding patterns. But we can also classify them into ecological groupings, for instance, that of detritivores, which refers to organisms that consume decomposing organic matter and encompasses a wide array of organisms, from fungi and worms to some bacteria. In the pluralist view, we cannot claim that one of these classifications is superior and ought to be endorsed at the expense of the other. Rather, both can be useful and equally valid depending on the purposes and contexts of scientific investigation. Pluralists are not typically associated with endorsing the abovementioned hierarchy thesis, since they have no problem allowing crosscutting classifications. But a pluralist can hold the view that different classification systems, responding to diverse scientific interests, still have to be hierarchically ordered. Even if the hierarchy thesis is normally associated with a monistic approach, therefore, the monism-versus-pluralism question and the idea of a strict hierarchy of natural kinds are conceptually distinct.
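The difference between nested and crosscutting classifications can be made concrete with a small sketch. The Python code below treats each kind, as a simplifying assumption, as nothing more than a set of members, and it reads the hierarchy thesis as the requirement that any two overlapping kinds be related as kind and sub-kind. The member labels and function names are hypothetical toy data chosen to mirror the hydrogen and detritivore examples; they are not drawn from any scientific dataset.

```python
from itertools import combinations


def crosscutting_pairs(kinds):
    """Return pairs of kinds that overlap without one containing the other."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(kinds.items(), 2):
        overlap = a & b
        nested = a <= b or b <= a
        if overlap and not nested:
            pairs.append((name_a, name_b))
    return pairs


def satisfies_hierarchy_thesis(kinds):
    """True if every pair of overlapping kinds is nested."""
    return not crosscutting_pairs(kinds)


# Toy chemical classification: each isotope kind is a sub-kind of hydrogen.
chemical = {
    "hydrogen": {"protium_atom", "deuterium_atom", "tritium_atom"},
    "protium": {"protium_atom"},
    "deuterium": {"deuterium_atom"},
    "tritium": {"tritium_atom"},
}

# Toy biological classification: only some bacteria are detritivores,
# so the ecological grouping cuts across the taxonomic one.
biological = {
    "fungi": {"fungus_1"},
    "bacteria": {"bacterium_1", "bacterium_2"},
    "detritivores": {"worm_1", "fungus_1", "bacterium_1"},
}

print(satisfies_hierarchy_thesis(chemical))    # True
print(satisfies_hierarchy_thesis(biological))  # False
print(crosscutting_pairs(biological))          # [('bacteria', 'detritivores')]
```

On this reading, a monist who accepts the hierarchy thesis would deny that both "bacteria" and "detritivores" in the toy data can be natural kinds, whereas a pluralist who allows crosscutting can accept both.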

b. How to Identify Natural Kinds: Their Role in Inductive Generalizations, Scientific Laws, and Explanations

So far, we have been dealing with very general questions concerning whether the world can be divided into certain privileged groupings. If indeed there are such groupings, that is, natural kinds, then it is worthwhile to establish the criteria for something to be a natural kind. Different accounts of natural kinds ascribe different features to them, but all of them, at a minimum, presuppose the following: The entities classified into a kind should share a set of common properties by which they are grouped together. This grouping of common properties ought not to be accidental. To illustrate this, we can think about cases in which we group entities together based on observable properties and then establish that there is a common cause that accounts for those properties. We note, for instance, that sunflowers (Helianthus annuus) share common observable properties—a large, usually yellow, flower head, a tall, erect stem, broad and rough leaves, and so on—and conclude that there must be an underlying explanation for such a clustering of properties. This explanation draws on the fact that all sunflower plants belong to the same species, which points to a common cause for the common properties. Regarding species in general, this common element might stem from shared ancestry or an ecological niche, exchanging genetic material through interbreeding with other species members, and so on.

The properties shared by members of a natural kind need not be directly observable. In many accounts, chemical elements are considered to be standard examples of natural kinds for which important properties shared by members of the kind are not directly observable. Take carbon, for instance. It is well known that different arrangements of carbon atoms constitute materials with extremely different properties, such as diamond and graphite. Nevertheless, both diamond and graphite are taken to be composed of the same element because they share a deep property, namely, the microstructure of their constituent atoms.

These features of natural kinds can help us see why it is useful to classify the world into such categories and indicate why natural kinds are commonly taken to play an important role in inductive inferences, scientific laws, and explanations. Let us briefly examine how the debate on natural kinds is entangled with these key issues in the philosophy of science. Classifying things into kinds according to their shared properties is theoretically and practically significant because it normally countenances inductive inferences about the members of kinds. Our previous encounters with sunflowers, for instance, allow us to infer some properties and behaviors related to this species, such as that they grow best when exposed to plenty of sun, in fertile, moist and well-drained soil; that they can be used to extract some toxic ingredients from the soil, such as arsenic or lead, and so on. Establishing the existence of stable, clustered properties associated with sunflowers thus underpins the inductive inference that future observed instances of this kind will also share some or all of those properties. This enables us to formulate relatively precise instructions for plant cultivation.

Natural kinds also play an important role in laws of nature or scientific laws. How this role is characterized and explained depends on the exact account of scientific laws one endorses (see the article on Laws of Nature). Consider copper as a candidate for a natural kind. All instances of copper share some common properties: They are soft, malleable, and ductile, with a reddish-orange color. These observable features can be accounted for by the atomic structure of copper, namely that it has a nucleus containing 29 protons and, in its stable isotopes, 34 or 36 neutrons, and that it is surrounded by 29 electrons localized in 4 shells. Like other metals, it consists of a lattice of atoms and has a single electron in the outer shell that does not remain connected to particular atoms but forms an electron cloud spreading through the lattice. This cloud, containing many dissociable electrons, makes the conduction of electric currents possible. These facts about the atomic structure of copper allow us not only to infer that a subsequently observed instance of copper will conduct electricity, but also to establish it as a scientific law of the following form: “All pure copper conducts electricity.”

The plausibility of this assumption about natural kinds depends on how stringently we construe natural laws. For instance, it is often taken that laws are necessary, exceptionless, and universal. Specifically, natural kind essentialists, as further explained in section 2.a, hold that there ought to be some common properties, that is, essences, shared by all and only members of a kind. The existence of these unique properties would, in turn, ground the idea that laws of nature necessarily hold with respect to members of natural kinds. In this view, it also follows that natural kinds ought to be categorically distinct; that is, there can be no continuum or smooth transition between different kinds. Rather, there should be some natural boundaries between them. Many authors argue, however, that essentialism is not the best account of natural kinds because it excludes many scientific categories such as those in biology, psychology, and other special sciences that do not fulfill its demanding criteria (Dupré 1981, Khalidi 2013).

The assumption that natural kinds play an important role in inductive inferences and scientific laws explains the widespread belief that natural kinds are important for scientific explanations. We saw how the atomic structure of copper explains its observable properties, such as electric conductivity. Establishing a common cause or mechanism that accounts for the grouping of properties in nature also provides an explanation for the behavior of entities thus classified. It must be noted, however, that the role natural kinds play in scientific explanations also depends on the notion of scientific explanation that one endorses (see the article on Theories of Explanation).

Thus far, the assumption has been that natural kinds are characterized by shared common properties, which in turn account for their role in inductive generalizations, scientific laws, and explanations. However, if we start with the assumption that natural kinds are those categories that play important roles in scientific inferences and theories, we ought to address the question of whether functional kinds, which are important in many scientific disciplines, are natural kinds. This issue is addressed in the next subsection.

c. Natural Kinds and Functional Kinds

Functional kinds are defined as groups of entities united by a common function—that is, by their activities and causal roles. Common examples include biological kinds, such as predator and prey; psychological kinds, such as pain; and artifact kinds, such as knives. What connects all these examples is that the entities in question are grouped together because of something they do, and not because they share similar underlying properties. Very different species of animals can belong to the predator category, such as jaguar, human, rattlesnake, or stork. Similarly, very different kinds of things can be used as a knife, from a piece of a sharp stone or glass to steel blades specifically manufactured for cutting food. This phenomenon is referred to as multiple realizability of functional types or kinds (see the article Mind and Multiple Realizability), and it has been a widely discussed topic in the philosophy of mind with regard to mental kinds, like pain.

On the one hand, one might argue that mental kinds, such as pain, cannot be taken to be natural kinds because they cannot be reduced to paradigmatic physical kinds. It is plausible that very different types of creatures, for instance humans, squids, and snakes, can experience pain, although they have very different types of neurophysiological architectures. If pain can be realized by different physical states, however, then it seems that pain could only be a “widely disjunctive” and disunified kind, in the sense that in humans it is realized by one set of neurophysiological states, in squids by another, in snakes by still another, and so on for different species. Some authors have concluded on these grounds that it is impossible to unify or reduce categories of special sciences to the more basic categories that we find in the physical sciences, which provide paradigmatic examples of natural kinds (Fodor 1974).

On the other hand, functional kinds, such as pain, play important roles in scientific explanations in various disciplines of special sciences, psychology being the most prominent example. Thus, if they play such an important role in the special sciences, it is worthwhile to examine them as candidates for natural kinds in such disciplines. Some see the fact that functional kinds play such an important role in scientific explanations as a reason to assume that they are not really multiply realizable and thus widely disjunctive, and that the properties important for realizing a function need to be shared by the entities grouped together. Alternatively, other authors argue that natural kinds can be multiply realizable and that functional kinds can be considered instances of natural kinds (Ereshefsky and Reydon 2015).

d. The Increase of Interest in Natural Kinds in the Twentieth Century

The topic of natural kinds gained momentum in the second half of the twentieth century in relation to two philosophical debates: the debate on paradoxes of confirmation and inductive inferences in the philosophy of science and the debate on theories of reference in the philosophy of language. Let us start with the first issue, since it relates to the aforementioned role that natural kinds play in inductive inferences.

Views of natural kinds that emphasize their role in inductive inferences face Goodman’s new riddle of induction (see the article on Confirmation and Induction). Nelson Goodman (1983) argued that there are innumerable ways to draw inductive inferences from a given data set. For instance, from the same data set consisting of green emeralds, we can infer either that all emeralds are green or that all emeralds are “grue,” a word Goodman invented for the purpose of this argument. “Grue” is a predicate that is defined relative to some fixed time: Something is grue if it is observed prior to the year 2050 and is green, or it is observed after the year 2050 and is blue. Every emerald observed so far is therefore both green and grue, so the same observations allow us to conclude by induction that all emeralds are grue. This leads to a paradoxical situation in which observing instances of green emeralds in the past can serve as an inductive basis for inferring that in the future, blue emeralds will be observed. We consider the induction based on the concept “green” to be acceptable and reject the induction based on “grue.” This indicates that the choice of kind concepts matters for preferring certain inductive inferences. The question arises as to how, or on what basis, we can draw the line between concepts that are suitable for inductive generalizations and those that are not.
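The definition of “grue” given above can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the class name Observation, the sample years, and the decision to treat an observation made in 2050 itself as falling on the “after” side are all hypothetical choices that the text does not settle.

```python
from dataclasses import dataclass

CUTOFF_YEAR = 2050  # the fixed time used in the definition above


@dataclass(frozen=True)
class Observation:
    year: int   # year in which the emerald is observed
    color: str  # color it is observed to have


def is_green(obs: Observation) -> bool:
    return obs.color == "green"


def is_grue(obs: Observation) -> bool:
    # Grue: observed before the cutoff and green, or otherwise blue.
    # Assumption: the cutoff year itself counts as "after".
    if obs.year < CUTOFF_YEAR:
        return obs.color == "green"
    return obs.color == "blue"


# Hypothetical evidence: green emeralds, all observed before the cutoff.
evidence = [Observation(year, "green") for year in (1990, 2005, 2024)]

# Both generalizations fit every observation made so far.
all_green_so_far = all(is_green(o) for o in evidence)
all_grue_so_far = all(is_grue(o) for o in evidence)

# The two hypotheses come apart only for emeralds observed after the cutoff.
future_green = Observation(2051, "green")
future_blue = Observation(2051, "blue")
```

On this evidence both all_green_so_far and all_grue_so_far hold, yet “all emeralds are green” predicts future_green while “all emeralds are grue” predicts future_blue. Nothing in the data alone distinguishes the two predicates, which is the point of the riddle.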

Willard Van Orman Quine (1969) introduced natural kinds as a solution to Goodman’s grue paradox and argued that what makes concepts projectable and suitable for inductive generalizations is the fact that they refer to natural kinds. On his account, natural kinds are sets whose members share similar properties. This does not entirely solve Goodman’s problem, however, since, according to Quine, natural kinds rest on an even more problematic notion of similarity. That is, to know how to classify objects into kinds, we already need to have an account of what makes properties similar in relevant respects. In his view, our standards for judging similarity are preset, that is, they are a part of our cognitive setup, and are needed for any learning to occur. The main question is why we should assume that our similarity standards track some real groupings in nature. Quine’s answer is that we are successful in making inductions because our similarity spaces have evolved through natural selection, by a process of trial and error. Goodman’s own solution to the problem he articulated is, simply, that certain concepts, for example, “green,” are better entrenched in our usage and language than others, such as “grue.” This means that we have used them more and have been successful in doing so. Thus, groupings that have proven to be inductively successful have become entrenched in our language.

A natural kind essentialist answers this problem by claiming that concepts suitable for inductive generalizations are those that correspond to the real, mind-independent groupings in nature and are characterized by shared essences. Non-essentialists, however, cannot endorse this answer because they contend that we do not have access to mind-independent divisions, even if they exist. They do not think that we can identify certain properties that all and only members of a kind share and in virtue of which they belong to natural kinds.

A different route to the topic of natural kinds came from the debates on theories of reference, specifically, Saul Kripke’s (1972) and Hilary Putnam’s (1975) essentialist views on natural kinds. These views were inspired by problems in the semantics of natural kind terms. Both Kripke and Putnam argue against descriptivist theories of the meaning of natural kind terms (see the article on Gottlob Frege), which identify the meaning of a term with the description of properties associated with that term. In the case of the term “water,” for instance, a descriptivist holds that the description of it as a clear, odorless, colorless, and drinkable liquid fixes its meaning. Kripke and Putnam argue, instead, that even if all the descriptions we associate with a natural kind term are false, we can still refer to that kind.

In Putnam’s Twin Earth thought experiment (see the article on Internalism and Externalism in the Philosophy of Mind and Language), he asks us to imagine a situation in which there is Twin Earth, a planet that is exactly like Earth except for one difference: Instead of water, there is a superficially identical liquid with a different chemical composition. That is, instead of H2O, it consists of XYZ. People on Twin Earth also refer to this liquid as “water.” But if we ask whether people on Earth and people on Twin Earth refer to the same stuff when they say “water,” the answer seems to be no. This means that there is more to reference than the description associated with a kind term. Putnam argues that something external to the user, namely the objective causal relation with the referent, is relevant for fixing the meaning of natural kind terms. What characterizes all instances of a kind is the fact that they bear some relation to other members of the kind; in the case of water, this is the relation of being the same liquid, that is, having the same chemical microstructure as other samples of water.

Kripke and Putnam advanced and popularized an essentialist view of natural kinds that many considered to be acceptable because it did not construe kind essences as elusive properties, but as something discoverable by scientific inquiry. This view, however, provoked reactions from philosophers dealing with special sciences, such as biology, psychology, psychiatry, and so forth, where scientific classifications do not fulfill the essentialist criteria (for an exhaustive overview and criticism of Kripke-Putnam’s version of essentialism, see LaPorte 2003). This has led to accounts of natural kinds that aim to loosen the criteria that determine which categories can constitute them. The most popular are the clustering accounts, which take natural kinds to pick out clusters of properties: Members of a kind do not need to share unique essences, but rather a certain number of common properties that are shared for nonarbitrary reasons. Section 2.b further discusses clustering accounts.

Other, more metaphysically minded philosophers, inspired by the work of Kripke and Putnam, started to develop an approach that has been termed scientific essentialism (Bird 2007, Ellis 2001). This view claims that the fundamental laws of nature hold because of essential properties of natural kinds. Thus, given that natural laws are grounded in the natural kind structure of the world, it is their essences that explain why the laws of nature are, in fact, metaphysically necessary. Roughly put, entities in the world must behave the way they do because of their natures. Scientific essentialists are usually concerned with fundamental kinds such as electrons, whose essential properties, like electric charge and mass, cause all their lawful behaviors.

The abovementioned contrasting reactions to essentialist views on natural kinds reflect a more general juxtaposition on how to approach the natural kinds debate. On the one side, there are authors, such as the scientific essentialists, who are more interested in the metaphysical problems and conceive of natural kinds as the most fundamental groupings of entities in the world. They tend to endorse very rigorous views on what it takes for a kind to be natural. Certain interpretations of essentialism are compatible with such an approach. On the other side, there are authors who are mainly oriented toward actual scientific practice and tend to assume that successful scientific classifications can be used as paradigmatic cases of natural kinds, and that the job of the philosophical accounts of natural kinds is to track the main features of such classifications and offer an account of natural kinds that will be able to encompass the scientific practice (Kendig 2015).

The next section provides an overview of the three most prominent accounts of natural kinds, starting with essentialism. The overview follows this general tendency to start with a strict philosophical account of natural kinds, and then to offer more relaxed criteria that take into consideration the data coming from the practice of scientific classification. Even essentialism, as the most demanding view, has been interpreted in different ways with the aim of capturing existing scientific categories. After essentialism, two more encompassing views are presented: cluster kinds, a view that emphasizes the clustering of properties specific to members of a kind without requiring the possession of unique kind essences, and, finally, the category of promiscuous kinds, which is the most liberal, allowing for members of a kind to have a small number of shared properties if they serve certain explanatory purposes.

2. Three Views on Natural Kinds: Essentialism, Cluster Kinds, and Promiscuous Realism

The three main views on natural kinds—essentialism, cluster kinds, and promiscuous kinds—are illustrated using specific examples from different scientific disciplines. The chemical elements are used to exemplify essentialism, since they are the most commonly used example of essentialist categories. The cluster kind view has been advanced as a reaction to the inadequacy of essentialism to capture many scientific classifications; biological species, being the most prominent among them, will be used to illustrate this account. Lastly, promiscuous realism, the most relaxed account of natural kinds, will be illustrated by invoking the example of psychiatric categories, which many consider to be highly disputable candidates for natural kinds. Since promiscuous realism allows even folk categories to count as natural kinds and allows for a vast range of interests to play an important role in establishing what constitutes a natural kind, psychiatric categories represent an interesting case study in which both scientific and practical concerns may be taken for establishing which classifications ought to be taken as relevant. It needs to be emphasized that the decision to illustrate the main accounts of natural kinds with these specific examples does not imply that these accounts are suitable only for those categories or those disciplines. Often, even though not necessarily so, authors proposing an account of natural kinds assume that it can be applied to all instances of natural kinds, regardless of the scientific discipline in question.

a. Essentialism: The Case of Chemical Elements

According to essentialism, natural kinds are groupings of entities that share a common essence, that is, intrinsic properties or structure(s) uniquely possessed by all and only members of a kind. An intrinsic property is a property that an entity has independently of any other things, while an extrinsic property is one that a thing has in virtue of some relations or interactions with other entities. The basic idea is that the essence causes and explains all other observable shared properties of the members of a kind and allows us to draw inductive inferences and formulate scientific laws about them. Chemical elements are used as standard examples of paradigmatic candidates for essentialist natural kinds. Their intrinsic properties, that is, the structures of their atoms, determine their observable properties. Take the case of hydrogen: Its atoms consist of a single proton in the nucleus and a single electron in the atomic shell. The structure of hydrogen atoms determines the bonds they can form with one another and with other entities, such as the bond in the diatomic molecule H2. These molecular forms then determine other properties of hydrogen, such as its colorlessness, odorlessness, tastelessness, and high combustibility at normal temperatures. The atomic structure also accounts for the prevalence of hydrogen in compounds such as water and organic compounds, because it has a disposition to form covalent bonds with nonmetallic elements. This makes the atomic structure of hydrogen its essence, a property that is shared by all hydrogen atoms and not shared by atoms of any other element.

Many essentialists think of the periodic table of elements as a perfect illustration of how things in the world are divided into natural kinds. In our exploration of nature, we can find different substances with a range of properties, but closer examination shows that they all belong to some basic categories clearly distinct from one another, and that this distinctiveness is a consequence of their intrinsic properties. The fact that chemical elements form natural kinds in virtue of their shared essences accounts for the fact that they ground scientific laws and inductive generalizations. For instance, knowing that something is hydrogen gas allows us to infer that it will spontaneously react with chlorine and fluorine at room temperature, thus forming potentially hazardous acids.

Essentialism requires natural kinds to be discrete or categorically distinct. If there were smooth or continuous transitions from one kind to another, we would have to decide, perhaps arbitrarily, where to draw the line of demarcation between them. The essentialist holds that essences are supposed to provide us with an objective criterion for where to draw such lines. It is precisely the requirement that members of each kind share a unique essence that excludes vague or unclear cases in which we cannot determine to which kind an entity belongs. Brian Ellis (1999), for instance, takes the discreteness of chemical categories as scientific evidence that the world is structured into essentialist kinds. He contends that if there were a smooth transition between different kinds, then the demarcation between them would not be drawn by nature; rather, we would have to decide where to draw the line.

Essentialists are typically, but not necessarily, monists. They typically hold that there is a single correct way of dividing the world into natural kinds. In this view, it might seem that categories are natural only if they constitute a unique way of organizing phenomena under investigation. In that case, there could be no crosscutting categories in the domain under investigation. Humans and dogs, for example, are classified into the category mammals, but dogs and crocodiles (which are not mammals) can be classified into the category quadruped. In such cases, a monist ought to claim that one of these categories is not a natural kind, that is, for instance, that mammals are a natural kind, while quadrupeds are not. It appears, though, that monists can accept overlapping classifications if they are hierarchically ordered, which means that in cases in which there is overlap between two different kinds, one must be a sub-kind of the other. Linnaean taxonomy is an example of a hierarchically ordered classification with seven different ranks of classification, starting with species at the lowest level, and ending up with kingdom at the top as the widest category, encompassing all the others. Humans are thus classified into the species Homo sapiens, and also into the class mammals and the kingdom animals; here, however, a species is a subcategory of a class, and a class is a subcategory of a kingdom.
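The difference between hierarchically ordered and crosscutting categories can be stated in terms of set inclusion. The following is a minimal sketch, assuming that each category is represented by a toy extension containing only the examples mentioned above; the function name relation and the three labels it returns are illustrative choices, not established terminology.

```python
from __future__ import annotations


def relation(kind_a: set[str], kind_b: set[str]) -> str:
    """Describe how two categories relate, given their toy extensions."""
    if not (kind_a & kind_b):
        return "disjoint"  # no shared members at all
    if kind_a <= kind_b or kind_b <= kind_a:
        return "nested"  # one category is a sub-kind of the other
    return "crosscutting"  # they overlap, but neither contains the other


# Toy extensions restricted to the examples used in the text.
mammals = {"human", "dog"}
quadrupeds = {"dog", "crocodile"}
animals = {"human", "dog", "crocodile"}

mammals_vs_quadrupeds = relation(mammals, quadrupeds)
mammals_vs_animals = relation(mammals, animals)
```

Here mammals and quadrupeds come out as crosscutting, because they share dogs while each has a member the other lacks, whereas mammals and animals come out as nested. A monist of the kind described above can accept the second pair but must deny that both members of the first pair are natural kinds.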

Nonetheless, the idea of crosscutting categories is not necessarily incompatible with essentialism. In nuclear physics, if we focus on patterns of radioactive decay and the stability of elements undergoing decay, then chemical elements can be classified in a way that crosscuts standard classification as captured by the periodic table. Radionuclides, for instance, are unstable atoms with excess nuclear energy that undergo radioactive decay. They can occur naturally or artificially. Examples include tritium, a radionuclide and an isotope of hydrogen, and carbon-14, a radioactive isotope of carbon. If we were to build a classification system that is based on the stability of radioactive atoms, it would be different from the standard chemical classification into elements, but one can still argue that it would track certain essences or essential properties.

In the philosophy of chemistry, microstructuralism is the essentialist view according to which chemical kinds ought to be individuated solely according to their microstructural properties (Hendry 2006), like the nuclear structure represented by the atomic number for chemical elements. While higher-level, observable properties can be used to identify what kind some entity belongs to, the microstructure has explanatory priority, and is the real arbiter of whether something belongs to a kind, because it is responsible for all the other properties and relations into which the entity can enter. The problem that microstructuralists face, however, is whether they can demonstrate that microstructural properties really have this potential and specify what the relevant microstructural similarities are. In addition, they need to explain why, in general, we should privilege groupings based on microstructure, as opposed to some other way of classifying things.

Let us grant, for example, that the essence of water is its H2O molecular structure. If we take an individual molecule of water, it will not have the observable properties we commonly associate with water. Moreover, water is more accurately described as containing H2O, OH−, H3O+, and some other less common ions. The problem is not that we do not know what the microstructure of water is; the problem is that there is no one microstructure responsible for the observed properties. In fact, the observed properties are a result of very complex and ever-changing interactions. It is correct to say that the average ratio of atoms of H and O is 2:1, but the observable properties of water do not depend upon this ratio. Rather, they depend upon the interactions between the dissociated ions.

It is far from straightforward to specify what exactly structural similarity amounts to, since this appears to be a matter of degree. It is unclear how much microstructural similarity is enough to individuate a natural kind. If we are focusing on the nuclear properties of atoms, we can target nuclear charge, where the atomic number (that is, the number of protons in the nucleus) is relevant for establishing a kind, in this case the kind chemical element. Alternatively, we can target nuclear mass, and reach a classification into isotopes. Isotopes of the same element have the same nuclear charge and undergo the same reactions at different rates, but the differences between them can be significant. Take the example of isotopes of uranium: uranium-235 and uranium-238. These isotopes differ not only in the number of neutrons, but also in other important properties, for instance, in how radioactive they are. Furthermore, take the example of chiral molecules, which have a similar structure but different dispositions due to their components being differently geometrically configured, one being a mirror image of the other. The question can be posed as to whether enantiomers, that is, chiral molecules that are mirror images of one another, form separate kinds according to microstructuralism.
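The way the choice of target property changes the resulting classification can be sketched as follows. This is only an illustration: the list contains just the nuclides named in this article plus hydrogen-1 and carbon-12 as hypothetical contrast cases, and the helper group_by is an assumption of the sketch, not part of any chemical software.

```python
from collections import defaultdict

# (name, atomic number Z, mass number A)
NUCLIDES = [
    ("hydrogen-1", 1, 1),
    ("tritium", 1, 3),
    ("carbon-12", 6, 12),
    ("carbon-14", 6, 14),
    ("uranium-235", 92, 235),
    ("uranium-238", 92, 238),
]


def group_by(key):
    """Group nuclide names by whatever property the key function picks out."""
    groups = defaultdict(list)
    for nuclide in NUCLIDES:
        groups[key(nuclide)].append(nuclide[0])
    return dict(groups)


# Targeting nuclear charge alone yields the chemical elements.
by_element = group_by(lambda n: n[1])

# Targeting nuclear mass as well yields a finer classification into isotopes.
by_isotope = group_by(lambda n: (n[1], n[2]))
```

The same six entries fall into three groups when only the atomic number is considered and into six groups when the mass number is added. Both classifications rest on microstructural properties, so microstructure alone does not tell us which of them individuates the natural kinds.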

When we turn to the classification of macromolecules such as proteins, we face the problem of justifying a classification based on microstructural properties, since macromolecules are standardly individuated by their functions. This has led some authors to propose a pluralism about macromolecular classification (Slater 2009). Microstructuralists can accept a form of pluralism as long as the kind essences are microstructural properties. Introducing functional properties of the macromolecules, however, goes beyond the scope of microstructuralism. Essentialists, more generally, as was already noted, can accept such pluralistic positions and allow, for instance, that the classification of chemical elements based on their atomic number stems from an interest in explaining particular material transformations and that chemical classifications might have been very different if we had started out with different interests (Hendry 2010). Thus, if we are interested in the behavior of biological macromolecules, we can classify them according to function rather than structural properties.

Such pluralist forms of essentialism can encompass a much wider range of scientifically interesting categories, but at a cost of reducing the importance of the role played by essences in causing and explaining all other properties typically associated with kind members. If we allow that a diverse range of interests tracks different essences, and we group the same entities into many crosscutting classifications, then essences would not play as important a role as has been assumed. The basic essentialist idea is that when we know which natural kind an entity belongs to, we can infer many important properties of that kind, exactly because the essence is responsible for all those shared properties. If, on the other hand, there are many different essences that we can track, and that, accordingly, enable the grouping of the same entities into different, crosscutting categories, then knowing the essence and which category an entity belongs to would not give us full information about the entity we are investigating. Rather, it would give us only partial information, depending on specific interests that lead us to investigate some group of entities.

Perhaps the most powerful objection leveled against essentialism is that it is inapplicable to kinds in many non-fundamental sciences. Biological species, for instance, which were long taken as standard examples of natural kinds, do not fulfill essentialist requirements. Moreover, essentialism seems to be incompatible with the Darwinian theory of evolution. There are no properties that all and only members of a species share. But even if we were to find some, we would expect that they could easily be changed by evolutionary mechanisms, such as mutation, recombination, and random drift. These considerations have led many authors to conclude that essentialism is not a satisfactory view of natural kinds (see, for instance, Sober 1994; Wilson, Barker and Brigandt 2007) and to declare “the death of essentialism” (Ereshefsky 2016).

The essentialists respond by restricting natural kinds to the more fundamental sciences, such as physics and possibly chemistry (see, for instance, Ellis 2008). The idea is that natural kinds refer to the groupings discovered by those sciences and the scientific classifications of higher-level sciences do not refer to natural kinds. Other philosophers reconceptualized essentialism to countenance essences as extrinsic or relational properties and not only as intrinsic ones. The property of being a descendant of a certain ancestor, for example, might be essential for belonging to a species. Acceptance of such a view, however, represents a significant departure from standard essentialism. Even though being a descendant of a Canis lupus familiaris might be a necessary and sufficient condition for belonging to the kind dog, this relation does not seem to play the role standardly associated with kind essences. The main motivation for essentialism is that possessing an essence accounts for the similarity of members of a kind. If we have a case where we can point to some common essence, but this essence does not guarantee that members of a kind will share important properties, then the essence does not play the role it was supposed to play. The fact that members of some species share a certain ancestor can cause many similarities between them, but there might also be many significant differences between them. Different breeds of dog, for instance, share a common ancestor, which makes them similar in certain respects, but they are also dissimilar in salient respects. For example, Siberian Huskies, with their double layered coats, are adapted for cold environments, while Border Collies are well equipped to withstand heat. Thus, sharing a common ancestor does not in any way guarantee that members of a kind will share a certain set of properties and thereby does not play the role that essence is supposed to play.

The problems that arise when we try to apply an essentialist account to many categories in the special or higher-level sciences, most notably, to the category of species in biology, have prompted relaxing the constraints proposed by essentialist views. The most established of such reactions is the proposal that natural kinds should be identified with clustered properties and not essential ones. In the next section, the cluster approaches to natural kinds are presented through the example of biological species, one of the main examples used to illustrate the adequacy of cluster approaches, especially in opposition to essentialism.

b. Natural Kinds as Property Clusters: The Case of Biological Species

Cluster kind approaches offer a less strict view of natural kinds. According to these views, the members of a kind need not share a set of necessary and sufficient properties; it is enough that they share some subset of properties that tend to cluster together due to some underlying common causes. The main idea is that nature is structured in such a way that properties are not randomly distributed across space-time; rather, they are systematically “sociable” (Chakravartty 2007), in the sense that families of properties form stable clusters. Natural kinds are categories that pick out such clusters of properties. This is a much more encompassing view than essentialism for three reasons: None of the properties are necessary for kind membership, it is sufficient that some of them are shared, and there is no requirement for a clear-cut division between members of a kind and nonmembers.
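The contrast between the two membership conditions can be illustrated with a rough sketch. Everything here is a simplifying assumption: the property labels are taken from the sunflower example earlier in the article, the two specimens are invented, and measuring cluster membership as a simple fraction of shared properties is a device of this sketch, since cluster accounts do not specify any numerical measure.

```python
# Properties that tend to cluster together in the kind (toy labels).
CLUSTER = frozenset({"yellow_flower_head", "tall_erect_stem", "broad_rough_leaves"})


def meets_all_conditions(properties: frozenset) -> bool:
    """Strict reading: every listed property is necessary for membership."""
    return CLUSTER <= properties


def cluster_overlap(properties: frozenset) -> float:
    """Cluster reading: membership is a matter of degree, measured here
    (purely for illustration) as the fraction of cluster properties shared."""
    return len(CLUSTER & properties) / len(CLUSTER)


# Hypothetical specimens.
typical = frozenset({"yellow_flower_head", "tall_erect_stem", "broad_rough_leaves"})
red_variant = frozenset({"red_flower_head", "tall_erect_stem", "broad_rough_leaves"})

typical_strict = meets_all_conditions(typical)
variant_strict = meets_all_conditions(red_variant)
variant_overlap = cluster_overlap(red_variant)
```

The red-flowered variant fails the strict test because it lacks one listed property, yet it still shares two of the three clustered properties. On a cluster reading it can count as a member, and because the overlap is graded there is no sharp boundary between members and nonmembers.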

Many philosophers of biology recognized the inadequacy of essentialism to account for species (Hull 1965, Sober 1994). It is hard to find traits that are uniquely shared by all and only members of a single species. According to evolutionary theory, any common trait can easily be changed through mutation, drift, or recombination. Since selection acts upon differences between traits, variation, rather than similarity, is the rule in the biological world and the fuel of evolution. Thus, practicing biologists do not classify organisms by identifying something like an essence that species members share; they do it by tracing phylogenetic relations (that is, ancestor-descendant relations or the evolutionary history of species members), interbreeding patterns, ecological niches, and so forth.

These considerations prompted a view that species are individuals, rather than biological kinds (Ghiselin 1974, Hull 1978). Similarly to functioning organisms, individuals are ontologically characterized by having spatiotemporally restricted and causally interconnected parts. In this view, to belong to a species does not mean that its members share some common properties but, rather, that they belong to an evolving lineage whose parts causally interact.

Richard Boyd (1991, 1999) introduced the Homeostatic Property Cluster (HPC) theory as an alternative view that can accommodate the idea that species are natural kinds. HPC characterizes natural kinds as clusters of co-occurring properties underpinned by homeostatic mechanisms that cause and sustain the property clusters. According to Boyd, biological kinds are good candidates for a natural kind cluster; species members share many, but not necessarily all, properties that are caused by various mechanisms, such as sharing a common ancestor, sharing an ecological niche, gene exchange, or common developmental mechanisms. This view allows for the possibility that there are many variations and differences between members of a species, while acknowledging that traits and properties of members of the same biological species are clustered due to the aforementioned mechanisms.

A problem for the HPC view is that, in some cases, properties of members of a kind need not be products of underlying homeostatic mechanisms. Since members of species vary in traits, for example, they might also vary in the underlying mechanisms that cause them. Consequently, we can have different underlying mechanisms distributed across a species that cause different traits in species members, such as human blood types, which are caused by different underlying genetic mechanisms. Unless we have a criterion for which of the traits and their underlying mechanisms are somehow more important or essential, it is up to us whether we will focus on the shared mechanisms that cause similarities between species members, or on the ones that are heterogeneous and cause differences between species members.

In addition, it has been claimed that the focus on underlying mechanisms is too restrictive and diverts attention from what really needs to be explained, that is, the stability and cohesiveness of properties that occur together. Thus, even less restrictive accounts have been offered, such as the Stable Property Cluster (SPC) view (Slater 2014). In this view, a grouping is considered a natural kind if it consists of clusters of stable properties and this stability is due to the instantiation of some of the properties that warrant a probabilistically reliable inference that other properties are instantiated as well. For this inference, it is not necessary to trace the underlying causes of such stability.

The main advantage of HPC and related clustering views—that they are much more permissive than essentialism—can also prompt worries. Unless one is very strict on how to individuate clusters of properties and/or their underlying causes, these accounts have the problem of determining how many clustered properties are enough to consider something a natural kind. A potential worry is that these accounts are overly liberal and that any clustering of properties might comprise a natural kind. This would go against the commonsense intuition that natural kinds pick out groupings that are in some sense privileged.

There are authors, nonetheless, who do not see this as a problem and who defend a view that natural kinds should not be considered some privileged subset of categories (Dupré 1981). According to such views, there are many sameness relations in the world that we pick out depending on our interests, and they all can qualify as natural kinds. The next section reviews one such account, promiscuous realism. As its name suggests, this account allows that many diverse interests can play a role in determining which grouping should count as a natural kind, thereby substantially expanding the set of categories considered natural kinds. Promiscuous realism is illustrated through psychiatric classifications. Many consider psychiatric categories to be problematic because there is often much heterogeneity among category members and there are many possible interests and relevant criteria for picking out psychiatric groupings. However, one cannot deny that those categories are scientifically useful and play an important role, both in scientific research and in practical contexts, which makes it interesting to examine whether there is a suitable philosophical account that might capture such categories, and promiscuous realism appears to be a fitting candidate.

c. Promiscuous Realism: The Case of Psychiatric Categories

According to promiscuous realism, depending on our interests and aims, there are many ways of classifying entities into kinds. This position was introduced by John Dupré (1981) and a similar view was also proposed, under the name of pluralistic realism, by Philip Kitcher (1984). Dupré holds that there are many sameness relations that can be used to distinguish different natural kinds and that none of those relations are privileged. That is, different entities can share some similarities with members of one group and some with members of another group, and which group we pick out as relevant will depend on our interests. This view is realist because it involves the criterion that something counts as a natural kind if its members share at least some similarities, even if minimal. Those similarities need to be some objective features of the world and not facts about us. For example, the fact that we group some things together would not count as a common property that can serve as a basis for classification.  This view therefore excludes as nonnatural classifications of entities which do not share any common properties. Different aims and interests will tend to produce different classifications, and those classifications can be taken as natural kinds if the members share at least some common properties that cause those entities to be categorized together in the first place.

While the cluster kind approaches have the problem of specifying where, exactly, to draw the line between clusters of properties that correspond to natural kinds and those that do not, promiscuous realism sidesteps this issue. If divisions between kinds come on a continuum and there are no clear cutoff points, promiscuous realism allows us to regard as natural kinds all classifications that group together entities that have at least some objective properties in common. This does not mean that all classifications are on equal footing. We can still consider some to better serve our purposes than others or to be better used in different contexts, but all of them can be considered natural in this minimal sense. This is a much more sweeping account of natural kinds because it countenances a wider range of categories as natural.

Dupré introduced this view by offering the example of different crosscutting categorizations into species, depending on which species concept is used in various biological subdisciplines, and classification practices outside biology. One of the hallmarks of promiscuous realism is that it does not prioritize scientific classifications over folk categories. Dupré (1981) provides examples of cases in which folk classifications do not correspond to biological classifications. For instance, our classification into butterflies and moths cross-classifies with the biological one. In fact, in many cases our classifications will be coarser-grained or finer-grained depending on our interests. What we call lilies, for instance, belong to the numerous genera of the lily family (Liliaceae), but our folk naming practice does not include the entire family, since we exclude onions and garlics that also belong to the same family. Dupré’s argument is that we should not try to change our folk categorizations to correspond to scientific ones because they often serve different purposes. We sort some plants of the lily family together because of their aesthetic properties, while we exclude garlics and onions because they serve culinary or other purposes. All these classifications can be considered natural and we can use one or the other depending on our interests and aims.

The promiscuous kind account has also been recognized as suitable for psychiatric classifications. It might seem that a psychiatric classification normally picks out a homogenous group of symptoms whose underlying cause(s) can be discovered and consequently treated, as described by the cluster kind accounts. This, however, is not what we often find in the actual practice of psychiatric classification. In this context, it seems even harder than in biology to find a stable cluster of common properties like symptoms or behaviors that are underpinned by a joint causal mechanism (Cooper 2012). Derek Bolton (2012) argues that the standard approach of classifying psychiatric conditions, starting with surface characteristics and then looking for their etiology to ensure reliability, is not as fruitful as it was initially assumed and that we should stop hoping that the etiology of psychiatric conditions will deliver one optimal classification scheme. Depending on our interests and pragmatic considerations, different types or subtypes of psychiatric categories will be taken as relevant. Thus, we might start using different sets of criteria to identify schizophrenia depending on our research interests. Those who are interested in treatment might parse out the symptoms and other criteria differently than those searching for the genetic causes of the condition. For this reason, it seems that promiscuous kinds can better account for classifications in psychiatry.

The fruitfulness of this approach can be illustrated by the introduction of biomarkers for marking biological correlates of different psychiatric conditions. The idea is to identify a biological causal chain or its correlates (for example, specific brain activation patterns) that underlie psychological and social characteristics associated with a psychiatric condition. The paradigmatic success story of this approach is the case of neurosyphilis, a disease that is characterized by psychiatric symptoms that are caused by the spirochete bacterium Treponema pallidum. Another example is a large project aimed at collecting genetic, biochemical, and imaging data for a population that has a high risk for Alzheimer’s disease. This has led to proposals for new classification schemes, based on biological features that measure the presence of a disease.

The problem with this type of approach is that it is not justified to expect that for every familiar psychiatric condition, normally identified by behavioral and psychological symptoms, we will find a common pathway that underpins those psychiatric symptoms, as in the case of neurosyphilis. In fact, not many have been found (Buckholtz and Meyer-Lindenberg 2012). More often, we find a diversity of symptoms with diverse etiologies constituting one psychiatric condition. While for some this might constitute a reason to discard psychiatric conditions, or most of them, as candidates for psychiatric natural kinds (for a discussion, see Murphy 2017), promiscuous realists would allow that even a minimum of shared properties is enough to consider something a natural kind, if the classification serves some purpose.

While promiscuous realism has the advantage of encompassing many classifications that would not be considered natural on the cluster accounts, it can be objected that it is too liberal in doing so. All categorizations that are at least minimally grounded in the causal structure of the world can be considered natural kinds. This might seem odd, since one of the starting points of the debate was the intuition that the objective structure of the world allows us to pick out some privileged groupings. Ordinarily, it is taken that those are the ones discovered through scientific inquiry, and that such groupings are superior to our everyday folk categories. In the promiscuous realist view, we can still privilege certain groupings, like the scientific categories, as being more explanatory or predictive, but it is not the case that these groupings are natural and that the folk ones, for example, are not. All those categories can be considered natural kinds, and prioritizing some over others will have to be justified by invoking our interests.

This does not necessarily present a problem since, in some contexts, we do not need scientific classifications. When cooking, for example, we might have more use for “The Scoville Heat Scale,” a measure of the hotness of chili peppers according to the concentration of capsaicin, a chemical compound that produces heat sensation, than for the botanical classification of chili pepper plants. In the context of scientific research, however, we could use some further guidance for favoring classifications that are better grounded. Thus, it seems desirable to have a further requirement that goes beyond some minimum of shared properties that can serve some or other purpose or interest. Promiscuous realists can respond by stating that the relevant properties and interests ought to be related in systematic ways. Additionally, we can refine the demands on scientific classifications by adding constraints on the purposes that classifications serve. While the classification of people into right-handed and left-handed, for example, is based on a property that members of these groups share, and it can serve some minimal purposes like informing us what kind of scissors to produce, it is not a very useful category because it is minimally informative. Thus, we can add that we should favor those scientific categories that are information rich and that can accommodate many of our interests.

To go back to psychiatric conditions, while, for some purposes, it might be useful to group together shy people or anxious people, these groups are commonly considered to be too heterogeneous to be considered natural kinds. The introduction of constraints on classifications that are focused on our interests and aims, and not only on the amount or importance of shared properties, brings us to the question of whether we ought to consider natural kinds to be groupings that exist independently of mind and can be discovered by us, or whether which kinds we consider natural is always related to us, the investigators. The first thesis is associated with a realist understanding of natural kinds, and the second one with an antirealist understanding, but one should be very careful in formulating what exactly it means to be a realist or antirealist about natural kinds. The next section further examines this issue and provides a taxonomy of the various realist and antirealist positions. The section also problematizes the reason the three main accounts of natural kinds presented in this section are commonly taken as realist views and discusses how to differentiate different antirealist views according to the way they demarcate which interests are taken as relevant for establishing what constitutes a natural kind.

3. Metaphysics of Natural Kinds

a. What Does It Mean that a Kind is Real?

Realism about some entity or domain P states that P exists, and that it exists independently of us, the cognizers, that is, independently of our classificatory practices, conceptual schemas, beliefs, values, and so on. One can be a realist about everyday objects, for instance, such as chairs, rocks, buildings, and trees—but also about intangible entities like numbers or moral value. Those who are antirealists about everyday objects, numbers or moral value generally do not claim that such things do not exist tout court. Rather, they hold that such entities depend on us and would not exist were there no creatures who can respond to them. Accordingly, natural kinds realists are committed to the view that natural kinds exist independently of mind. When we talk about entities belonging to a kind, it seems straightforward to establish what it would mean that they exist independently of mind. For instance, on one hand, mental states are necessarily mind dependent. On the other hand, most people will agree that rocks and mountains exist independently of mind, that is, they would exist even if there were no one to perceive them or think about them. When we talk about groupings of such entities into kinds, however, there are at least two possible interpretations of the claim that groupings themselves are mind independent.

In one interpretation, to say that natural kinds exist independently of mind means that they exist as separate entities. Usually, this claim is taken to imply that natural kinds are universals, a special type of repeatable entity that can be instantiated with many particular objects (see the article on Universals). Realism about natural kinds as universals has been called strong realism (Bird and Tobin 2015). There are alternative views, however. For instance, kinds might exist as particulars or as some special sui generis entities (Hawley and Bird 2011). This debate relates to the more general question regarding the metaphysics of properties and is not discussed further in this article. Here, the focus is on debates on natural kinds in the philosophy of science.

In the second interpretation, natural kinds exist independently of mind in the sense that there are divisions in nature that obtain independently of our classificatory practices. The assumption is that the world is structured in such a way that certain ways of classifying it, or carving it up, are correct solely in virtue of that structure. This view has been called weak realism about natural kinds, or naturalism (Bird and Tobin 2015). Weak realism or naturalism seems to be consistent with natural kinds nominalism. Even though weak realists hold that there is a metaphysical difference between natural and nonnatural classifications, it does not automatically follow that this difference needs to be spelled out in terms of a special ontological category of natural kinds. In what follows, the term natural kinds realism refers to weak realism or naturalism.

This approach to natural kinds has recently been called a zooming-in model (Reydon 2016) because it assumes that a careful examination of nature—“zooming in” to it—will lead to the discovery of mind-independent groupings. In this view, natural kinds are found in nature and not created by us. The next section starts by examining the difference between scientific realism and natural kinds realism. It then looks at how the three most prominent accounts of natural kinds discussed in the previous section can be interpreted as realist views of kinds. The analysis starts with essentialism as the strongest and most typical realist view. It then reviews cluster kinds and promiscuous kinds. These are commonly considered realist views, but, as this discussion shows, can potentially be interpreted in the antirealist vein. After that, the section offers a taxonomy of antirealist views, starting with strong versions and continuing with more moderate ones, where the difference between realism and antirealism is much subtler.

b. The Relationship Between Scientific Realism and Natural Kinds Realism

To say that entities that are being classified by our scientific theories exist independently of us and our classificatory practices is one formulation of the thesis of scientific realism (see the article on Scientific Realism and Antirealism).  An interesting question in the debate on natural kinds realism is how to formulate this idea and what its relation to scientific realism is. Some authors presuppose that scientific realism and natural kinds realism amount to the same thesis. Stathis Psillos, for example, states that the metaphysical thesis of scientific realism is committed to the claim that the “world has a definite and mind-independent natural-kind structure” (Psillos 1999, xvii). Bird and Tobin similarly claim that “it is a corollary of scientific realism that when all goes well the classifications and taxonomies employed by science correspond to the real kinds in nature” (Bird and Tobin 2008, introduction). Conceptually, however, it appears that scientific realism can be kept distinct from realism about natural kinds. The claims of the existence of certain entities, which are members of natural kinds—say, electrons—can be interpreted both as saying that there are mind-independent entities with certain properties, as described by the scientific theory, and as a stronger claim that there is an objective, mind-independent criterion for how to categorize those entities into the kind electron.

Scientific realism refers, at a minimum, to the idea that science investigates facts about entities, their properties, and the relations in which they stand that are objective or mind independent. Natural kinds realism can then be read as a further thesis, according to which, in addition to the existence of mind-independent entities and processes, certain structure(s) of kinds of entities and the criteria by which we group and individuate them are equally mind independent (Chakravartty 2011). That is, there are correct ways of categorizing the world that reflect this mind-independent natural kind structure.

c. Natural Kinds Realism

Essentialism is a paradigmatically realist view because it holds that the sources of similarities between members of a kind are intrinsic and independent of circumstances or our cognitive practices or interests (Ellis 1999). Even if the essentialism in question is pluralistic and allows for many crosscutting categorizations, it is nonetheless the fact that entities grouped together share an essence that makes kinds natural. On the other hand, antirealist or conventionalist views hold that we do not have access to the supposed real divisions in nature, or real essences of kinds, and, hence, we decide where to draw the boundaries between different kinds according to our interests and aims. Invoking our interests and aims as relevant for establishing a category as a natural kind is thus more akin to antirealist views. Both cluster approaches and promiscuous realism are commonly considered to be realist views, however, though they do invoke our aims and interests as relevant.

There are at least two strategies for accommodating the idea that a theory that invokes our aims and interests as relevant for determining which kinds are natural can still be considered realist. They both rely on arguments that aim to show that classifications that serve our interests and aims are exactly those that capture preexisting mind-independent divisions in nature. A cluster kind realist can invoke a version of the no-miracles argument (see the article on Scientific Realism and Antirealism) and argue that the fact that certain categories are successful in inductive inferences, predictions, and explanations gives us reason to conclude that they reflect some objective divisions in nature. The argument is that it would be a miracle if our inductive practices worked without latching onto some categories, namely natural kinds, that are objective. An objection to this no-miracles argument could be that it does not prove the mind-independence of the categories that we use in inductive inferences because the success of those inferences is similarly measured relative to how well they satisfy our interests and aims.

To this objection, a cluster kind realist can reply that some clusters of properties will be identified no matter what interests and aims one starts with, and that such clusters represent natural kinds. Matthew Slater, for instance, indicates that “[p]erhaps there are some clusters of properties such that no matter how a discipline adjusted its norms and aims… the category that cluster described would be fit to play a robust epistemic role in the discipline” (Slater 2014, 406). In this strategy, the kinds we take to be natural do, in a sense, depend on our aims and interests because, were it not for those aims and interests, we would not reach those classifications. What justifies taking such classifications to be natural kinds, however, is the fact that other people, starting with different aims and interests, would also reach similar classifications. The main problem with this kind of defense of natural kinds realism is that it might end up with a very small set of categorizations that qualify as natural kinds, since there seem to be many categorizations that would not be recognized by people starting with very different interests and aims. Our interest in accounting for certain material transformations brought us, for example, to classification by chemical elements according to atomic structure. But if we are interested in patterns of radioactive decay, we will arrive at different classifications, and if, hypothetically, we were interested only in the behavior of materials in centrifuges, we would arrive at classifications based on density (Franklin-Hall 2015).

Another, less demanding strategy is to treat natural kinds as domain dependent. P.D. Magnus (2012) has explicitly defended this view, but Boyd (1991) and Khalidi (2013) also seem to endorse it. The idea here is not that any rational inquirer with any type of interest will, or ought to, reach the same classifications, but rather that the realism in question amounts to the claim that classifications are natural relative to the domain of inquiry. That is, what is required is that inquirers with the same interests and aims arrive at the same classifications. This, from the viewpoint of many natural kind realists, ensures that our categorizations track the causal structure of the world. There is no disinterested point of view that will discover the real natural kinds. Rather, there are many different points of view, and what makes a grouping natural is that, when we fix what we are interested in, we also fix the correct ways to classify a domain of investigation according to those interests. Thus, even though our interests play an important role in identifying natural kinds, once we have fixed them, there are still correct and incorrect ways to classify the domain in question, and what determines this are the features of the entities being classified.

Promiscuous realism, as the name suggests, is a realist view because it takes as natural those classifications whose members share at least some (or one) common property, which is an objective and mind-independent fact. This view, even though it is extremely permissive and allows a vast range of classifications to be considered natural kinds, excludes at least some classifications. The fact that the excluded ones are not natural kinds is true by virtue of mind-independent facts; that is, in virtue of the fact that they do not share any common properties. This is a very weak version of realism because it merely captures the fact that we cannot group together entirely arbitrary collections of objects. It does not, however, offer any further realist criteria for privileging certain classifications over others. Further refinements of the promiscuous realist view seem to rely on invoking antirealist criteria, for example, by putting constraints on relevant interests and aims in scientific classifications.

d. Natural Kinds Antirealism

Antirealism, or conventionalism, as it has been called, encompasses a wide range of views. All of them have in common the claim that what determines which kinds are natural are not only mind-independent facts about the world, but also facts about “us,” the cognizers or researchers. It is important to emphasize that antirealism, on this reading, is not committed to the further thesis that the identity of natural kinds, or the criterion for what makes a kind natural, is fully mind dependent. That view, according to which natural kinds are fully mind dependent and the world does not constrain our classifications, would then represent the most extreme variety of antirealism, which has been called strong conventionalism (Bird and Tobin 2015).

According to strong conventionalism, not only are there no mind-independent facts about which groupings are natural, but, also, all the differences and similarities between different entities are entirely dependent on us. Thus, any common properties among members of a kind that we identify as the basis for grouping them together are products of our classificatory practices and do not exist independently of mind. This view might go hand in hand with the more general antirealist view regarding the existence of a mind-independent world. This view sees natural kinds as exclusively those categorizations that we use in our classificatory practice. Since there is nothing about the world that would sanction some groupings as opposed to others, natural kinds would depend on our explicit beliefs about what kinds exist (see Franklin-Hall 2015). In this view, therefore, kinds are entirely subjective. But this view has counterintuitive consequences: It would make categories such as witches or hysteria equally as legitimate as scientific categories that include electrons or species, depending on the circumstances in which they are used.

Another antirealist view claims that our ignorance or lack of access to the natural principles of classification, if they exist, leads us to conclude that our grouping of objects into a natural kind will depend, at least partly, on our interests, aims, and cognitive capacities. This view can take at least two possible forms. One is that we do not have access to the real essences of kinds—there are natural principles of classification, but they are inaccessible to us. Another is the argument that there are no clear divisions in nature, or no discoverable natural principles of classification. Rather, there are only continuous gradations between different kinds of things, so it is partly up to us where to draw the line. This implies that our epistemic aims, cognitive capacities, and practical interests might play a role in deciding where to draw such lines and what classifications to endorse. This type of view has been called weak conventionalism (Bird and Tobin 2015). It is characterized by the claim that both the causal structure of the world—mind-independent facts—and facts about us jointly determine which categorizations we will consider to be natural.

The main thesis of weak conventionalism is nicely illustrated by Reydon’s (2016) co-creation model of natural kinds. In this model, kinds are taken to be co-determined by both states of affairs in nature and the background assumptions and decisions of investigators in specific scientific contexts. It can encompass a broad set of views. Depending on how exactly one thinks of cognitive capacities, epistemic or practical aims, and what kinds of interests are taken as legitimate, different antirealist views can be developed.

A simple pragmatist approach to natural kinds, as defined by Laura Franklin-Hall (2015), holds that natural kinds correspond to categories that fulfill some of our epistemic and/or practical aims. This is a very broad understanding of natural kinds that allows a wide range of categories to be considered natural kinds. It does, however, exclude entirely arbitrary categories. It excludes them because they cannot serve any useful purpose. On the other hand, it allows categories such as proteins, gluten-free food, or introverts to be natural kinds, since they fulfill some of our interests. While in the practical sense this account would deem as natural most, if not all, of the same groupings as promiscuous realism, it is important to notice that the reason these accounts hold that certain groupings are natural is different. While the realist stresses that the reason the grouping is useful is in the fact that certain objective properties are shared, the antirealist does not care about that. The antirealist focuses instead on whether the grouping is useful and serves some purpose, regardless of whether it is based on some objective property. A simple pragmatist view countenances as natural all the groupings that are in some way relevant to us.

One problem with this view arises when we start thinking about the possibility of our interests being somewhat different than they are. This commits the view to a potentially awkward consequence, in which any change in our interests entails the existence of different natural kinds. To resist this consequence, a pragmatist can offer a way to refine which interests can be taken as relevant for judging whether a kind is natural. For instance, one can restrict the possible range of interests by considering what interests some idealized and fully informed agent or inquirer would have or would endorse.

Another potentially problematic consequence of the simple pragmatist view is that practical issues can outweigh factual ones when it comes to deciding which classifications to adopt. The psychiatric classification antisocial personality disorder, for example, most likely groups together a very heterogeneous class of people whose only common feature is that they engage in some sort of criminal behavior (Brazil et al. 2016). From the point of view of scientific research, we should strive to find classifications that are better grounded in commonalities that their members share. From a practical point of view, however, it might be enough to know only that people belonging to this group have committed crimes and that it is likely that they will do so again in the future (Brzović et al. 2018; Malatesti and McMillan 2014).

A more common variation on the weak conventionalist view focuses on our epistemic interests and has been called the simple epistemic view (Franklin-Hall 2015). In this approach, natural kinds correspond to categories that fulfill some of our epistemic aims. It differs from the simple pragmatist view by excluding practical interests as relevant for circumscribing natural kinds. Cluster kinds, for instance, can have realist and antirealist readings, depending on what one focuses on. The realist would say that what makes such kinds natural is the fact that they track real clusters of properties in the world, while the simple epistemic antirealist would argue that what makes them natural is their success in fulfilling our epistemic aims, such as being predictive and explanatory. We therefore do not start by looking for clustered properties, but by looking for categories that successfully fulfill our epistemic aims. In many cases, categories based on clustered properties will accomplish this aim. In this reading, the aim of our scientific endeavors is to develop the most accurate descriptions of the world we live in, and the categories that best serve this purpose ought to be considered natural kinds. Such views have been characterized as epistemology-oriented approaches to natural kinds (Reydon 2009).

The main difficulty with these approaches is to explain how to circumscribe the set of epistemic aims that we take to be relevant in establishing which groupings correspond to natural kinds. If we take our present aims as relevant, this has the welcome consequence that our present successful scientific categories come out as natural kinds. However, we might exclude some classifications that we might reach if our interests were to change or if our knowledge expanded or got revised. To solve this type of problem, Franklin-Hall (2015) offers a more elaborate antirealist approach, the categorical bottleneck view. She identifies natural kinds with categories that fulfill the interests that we and a wider range of epistemic agents with different interests and cognitive capacities have in common. Here, however, the relevant interests are limited to what we and our “neighboring agents” would recognize as scientifically relevant classifications. That is, we do not consider any possible epistemic agents apart from those that are relatively like us. Neighboring agents are those that only somewhat differ from actual agents in their epistemic aims and interests. This restriction of possible interests is meant to ensure more objectivity for natural kind categories by eliminating those that might be contingent on some of our cognitive capacities or limitations in knowledge.

Thinking about the synchronic and diachronic ways of considering naturalness illustrates the problem that antirealist views face, namely that classifications based on our interests can seem to lack objectivity (Chang 2016). The synchronic aspect looks at a specific moment of scientific development, usually the one that is of immediate interest, and examines whether the scientific categories that are in use can be considered natural kinds. The relevant criteria include whether the categories play an important role in scientific explanations, whether they are predictive, whether they figure in scientific laws or lawlike generalizations, or even whether they fulfill certain practical purposes. If we only focus on the present moment, we might be tempted to conclude that natural kinds are those categories that fulfill our present epistemic interests and aims. If we concentrate instead on the diachronic aspect of naturalness and investigate what it means for a category to be a natural kind throughout different periods of scientific development, then it might turn out not to be beneficial to focus on our present interests and aims, since there is always the possibility that they might change as new information comes in and new scientific theories are accepted.

Thinking about the question of what makes a kind natural across different stages of scientific development can be used to illustrate the way different antirealist positions can reach different conclusions. One reading of the simple epistemic view claims that, throughout the development of science, different categories have served our interests, and, for this reason, we can consider them to be natural kinds in the contexts in which they were used. The consequence of this view is that natural kinds are relative to the context of scientific investigation. In this view, categorizations such as phlogiston or hysteria, for example, were natural kinds in one historical period but not in others.

A different reading of the simple epistemic view argues that only our present categories correspond to natural kinds, while the ones that served our interests in previous stages of scientific development were not natural kinds if they differed from the present ones. This has the problematic consequence, however, that in the future we might develop different epistemic aims and interests, and that other categories would therefore come to be considered scientifically grounded. We thus could not consider them natural kinds, since the notion of natural kinds is tied to our present interests. One option for the simple epistemic view is to argue that our interests and aims do not change to a large degree, but rather it is our factual knowledge regarding how to fulfill our aims that changes. The claim that natural kinds are categories that best serve our epistemic aims and interests thus presupposes that they are best served when we have all the required information regarding matters of fact. Thus, while it might appear that our aims change substantively with the development of science, what actually changes is our access to information on how to fulfill them. Another option for the antirealist is to abandon the simple epistemic view and embrace something more akin to the categorical bottleneck view, which ensures more objectivity for natural kinds by introducing a wider range of epistemic agents with different interests and cognitive capacities.

Domain-dependent realism, which is closest to antirealist views in the sense that it makes natural kinds relative to different domains of inquiry, solves this problem by construing natural kinds as categories that we would adopt if starting with the same interests. The idea here is not merely that any category that is useful in certain contexts is a natural kind, but rather that, once we start with certain fixed interests, there is a correct way to classify the domain of inquiry. Thus, it is not enough that certain classifications fulfill our interests, because even the categorizations we now take to be flawed still fulfill our interests to a certain extent. Rather, we should aim at finding the correct ones within our domain of interest. Consequently, there is a possibility that our current classifications are not natural kinds, because it might turn out that there are better, or more refined ones, which will more perfectly fulfill our interests. That is, we can always assume that further scientific developments and new data will lead us to reconsider our current classifications. This is a feature of realist views in general, that we cannot be certain that our current knowledge reflects the real states of affairs or, in this case, that our classifications reflect natural kinds.

These problems nicely illustrate the main benefits and drawbacks of realist and antirealist views. We start out with the intuition that natural kinds pick out some objective features of the world and that what kinds are natural is not supposed to change across different contexts. Realism easily accounts for this objectivity by arguing that natural kinds represent the way the world is structured independently of us. Thus, kinds are out there to be discovered, and they cannot change across different scientific contexts or vary with different researchers’ interests. The problem for the realist, then, is to demonstrate how it is possible to access such natural groupings, to locate nature’s joints. The realist has to offer something like essences or very clearly delineated clusters of properties and to try to convince skeptics that these really are the natural ways to divide the entities in the world. Realist positions are characterized by their openness to the possibility that our current categories do not actually capture natural kinds. Even domain-dependent realists can always question whether—if there is a convergence on a certain classification by everyone sharing the same interest, that is, working in the same discipline—we might, with future scientific developments, discover new facts that will lead us to reexamine those classifications.

Antirealism, on the other hand, gives weight to the researchers’ contribution to scientific classification, but at the cost of sacrificing the objectivity of kinds. In this account, natural kinds can be seen as relative to the specific contexts of investigation. This has the consequence that the kinds that are deemed natural will change as scientific research advances. Thus, while hysteria, to take an example cited previously, was at one point a natural kind, that is no longer the case. To avoid this consequence, the antirealist can offer a way to sanction possible interests and aims to arrive at a more objective view of natural kinds. Such sanctioning, however, naturally leads us to postulate objective features of the world that our classifications ought to identify. These realist intuitions again lead us away from the starting ambition to encompass actual scientific classifications.

4. Conclusion

The growth of interest in natural kinds among philosophers of science stems from two sources. One relates to debates on scientific confirmation and inductive reasoning; the other has emerged from debates regarding the reference of scientific terms. With the further development of the philosophies of specific scientific disciplines such as biology, chemistry, psychology, psychiatry, and so on, theorizing about natural kinds moved more in the direction of examining successful scientific classifications and offering philosophical accounts that should capture those classifications. In this regard, we can identify two main approaches to the natural kinds debate and the corresponding roles they are supposed to play: on the one hand, a traditional, more prescriptive one; on the other, a descriptive one that aims to stay close to scientific practice.

This move is transparent in the three major approaches to natural kinds presented in this article. On one hand, essentialism, with its strict search for clearly demarcated kinds, has been criticized as being too restrictive, because it leaves out many important scientific categorizations. On the other, the cluster kind and promiscuous realism approaches have been worked out with the aim of providing a framework that will capture classifications in actual scientific practice. This tendency is effective insofar as it brings philosophical accounts closer to science. However, it risks minimizing the prescriptive role that natural kinds should play in scientific research, because philosophers using this approach tend to equate current scientific classifications with natural kinds. In the debate on the metaphysics of natural kinds, the dichotomy between these two approaches is reflected in a tension between attempts to ensure the objectivity of natural kinds and attempts to stay close to scientific practice by emphasizing that natural kinds ought to fulfill our current interests.

5. References and Further Reading

  • Bird, Alexander. Nature’s Metaphysics: Laws and Properties. Oxford University Press, 2007.
  • Bird, Alexander, and Emma Tobin. “Natural Kinds.” The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Stanford University, Spring 2015.
  • Bolton, Derek. “Classification and Causal Mechanisms: A Deflationary Approach to the Classification Problem.” In Philosophical Issues in Psychiatry II: Nosology, edited by K. S. Kendler and J. Parnas, Oxford University Press, 2012, pp. 6–12.
  • Boyd, Richard. “Homeostasis, Species, and Higher Taxa.” In Species: New Interdisciplinary Essays, edited by R. A. Wilson, MIT Press, 1999, pp. 141–185.
  • Boyd, Richard. “Realism, Anti-Foundationalism and the Enthusiasm for Natural Kinds.” Philosophical Studies, vol. 61, no. 1, 1991, pp. 127–148.
  • Brazil, Inti A., J. D. M. van Dongen, J. H. R. Maes, R. B. Mars, and Arielle R. Baskin-Sommers. “Classification and Treatment of Antisocial Individuals: From Behavior to Biocognition.” Neuroscience & Biobehavioral Reviews, 2016.
  • Brzović, Zdenka, Marko Jurjako, and Predrag Šustar. “The Kindness of Psychopaths.” International Studies in the Philosophy of Science, vol. 31, no. 2, 2017, pp. 189–211.
  • Buckholtz, Joshua W., and A. Meyer-Lindenberg. “Psychopathology and the Human Connectome: Toward a Transdiagnostic Model of Risk for Mental Illness.” Neuron, vol. 74, no. 6, 2012, pp. 990–1004.
  • Chakravartty, Anjan. A Metaphysics for Scientific Realism: Knowing the Unobservable. Cambridge University Press, 2007.
  • Chakravartty, Anjan. “Scientific Realism and Ontological Relativity.” The Monist, vol. 94, no. 2, 2011, pp. 157–180.
  • Chang, H. “The Rising of Chemical Natural Kinds through Epistemic Iteration.” Natural Kinds and Classification in Scientific Practice, edited by C. Kendig, Routledge, 2016, pp. 33–47.
  • Cooper, Rachel. “Is Psychiatric Classification a Good Thing?” Philosophical Issues in Psychiatry II: Nosology, edited by K.S. Kendler and J. Parnas, Oxford University Press, 2012, pp. 61–70.
  • Dupré, John A. “Natural Kinds and Biological Taxa.” The Philosophical Review, vol. 90, no. 1, 1981, pp. 66–90.
  • Ellis, Brian. “Essentialism and Natural Kinds.” The Routledge Companion to Philosophy of Science, edited by M. Curd and S. Psillos, Routledge, 1999, pp. 139–149.
  • Ellis, Brian. Scientific Essentialism. Cambridge University Press, 2001.
  • Ereshefsky, Marc. “Species.” The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Stanford University, 2016. https://plato.stanford.edu/archives/sum2016/entries/species.
  • Ereshefsky, Marc, and Thomas A. C. Reydon. “Scientific kinds.” Philosophical Studies, vol. 172, no. 4, 2015, pp. 969–986.
  • Fodor, J. “Special Sciences (Or: The Disunity of Science as a Working Hypothesis).” Synthese, vol. 28, no. 2, 1974, pp. 97–115.
  • Franklin-Hall, Laura. “Natural Kinds as Categorical Bottlenecks.” Philosophical Studies, vol. 172, no. 4, 2015, pp. 925–948.
  • Ghiselin, M. “A Radical Solution to the Species Problem.” Systematic Zoology, vol. 23, no. 4, 1974, pp. 536–544.
  • Goodman, Nelson. Fact, Fiction and Forecast. Harvard University Press, 1983.
  • Hawley, K., and A. Bird. “What are Natural Kinds?” Philosophical Perspectives, vol. 25, no. 1, 2011, pp. 205–221.
  • Hendry, Robin F. “Elements, Compounds and Other Chemical Kinds.” Philosophy of Science, vol. 73, no. 5, 2006, pp. 864–875.
  • Hendry, Robin F. “The Elements and Conceptual Change.” The Semantics and Metaphysics of Natural Kinds, edited by H. Beebee and N. Sabbarton-Leary, Routledge, 2010, pp. 137–158.
  • Hull, David L. “A Matter of Individuality.” Philosophy of Science, vol. 45, no. 3, 1978, pp. 335–360.
  • Hull, David L. “The Effect of Essentialism on Taxonomy: Two Thousand Years of Stasis.” British Journal for the Philosophy of Science, vol. 15, no. 60, 1965, pp. 314–326.
  • Kendig, Catherine, editor. Natural Kinds and Classification in Scientific Practice. 1st ed., Routledge, 2015.
  • Khalidi, Muhammad Ali. Natural Categories and Human Kinds: Classification in the Natural and Social Sciences. Cambridge University Press, 2013.
  • Kitcher, Philip. “Species.” Philosophy of Science, vol. 51, no. 2, 1984, pp. 308–333.
  • Kripke, Saul. “Naming and Necessity.” Semantics of Natural Language, edited by G. Harman and D. Davidson, Reidel, 1972, pp. 253–355.
  • LaPorte, Joseph. Natural Kinds and Conceptual Change. Cambridge University Press, 2003.
  • Magnus, P. D. Scientific Enquiry and Natural Kinds: From Planets to Mallards. Palgrave Macmillan, 2012.
  • Malatesti, Luca, and John McMillan. “Defending Psychopathy: An Argument from Values and Moral Responsibility.” Theoretical Medicine and Bioethics, vol. 35, no. 1, 2014, pp. 7–16.
  • Murphy, Dominic. “Can Psychiatry Refurnish the Mind?” Philosophical Explorations, vol. 20, no. 2, 2017, pp. 160–174.
  • Plato. Phaedrus. Cambridge University Press, 1952.
  • Psillos, Stathis. Scientific Realism: How Science Tracks Truth. Routledge, 1999.
  • Putnam, Hilary. “The Meaning of ‘Meaning.’” Minnesota Studies in the Philosophy of Science, vol. 7, 1975, pp. 215–271.
  • Quine, Willard van Orman. “Natural Kinds.” Ontological Relativity and Other Essays, edited by W. V. Quine, Columbia University Press, 2012, pp. 114–138.
  • Reydon, Thomas. “From a Zooming-In Model to a Co-creation Model: Towards a more Dynamic Account of Classification and Kinds.” Natural Kinds and Classification in Scientific Practice, edited by C. E. Kendig, Routledge, 2016, pp. 59–73.
  • Reydon, Thomas. “How to Fix Kind Membership: A Problem for HPC Theory and a Solution.” Philosophy of Science, vol. 76, no. 5, 2009, pp. 724–736.
  • Slater, Matthew H. “Macromolecular Pluralism.” Philosophy of Science, vol. 76, no. 5, 2009, pp. 851–863.
  • Slater, Matthew. “Natural Kindness.” British Journal for the Philosophy of Science, vol. 66, no. 2, 2015, pp. 375–411.
  • Sober, Elliott. “Evolution, Population Thinking and Essentialism.” Conceptual Issues in Evolutionary Biology, edited by Elliott Sober, MIT Press, 1994, pp. 161–189.
  • Wilson, R. A., M. J. Barker, and I. Brigandt. “When Traditional Essentialism Fails: Biological Natural Kinds.” Philosophical Topics, vol. 35, no. 1/2, 2007, pp. 189–215.

 

Author Information

Zdenka Brzović
Email: zbrzovic@gmail.com
University of Rijeka
Croatia

Niccolò Machiavelli (1469—1527)

Machiavelli was a 16th century Florentine philosopher known primarily for his political ideas. His two most famous philosophical books, The Prince and the Discourses on Livy, were published after his death. His philosophical legacy remains enigmatic, but that result should not be surprising for a thinker who understood the necessity to work sometimes from the shadows. There is still no settled scholarly opinion with respect to almost any facet of Machiavelli’s philosophy. Philosophers disagree concerning his overall intention, the status of his sincerity, the status of his piety, the unity of his works, and the content of his teaching.

His influence has been enormous. Arguably no philosopher since antiquity, with the possible exception of Kant, has affected his successors so deeply. Indeed, the very list of these successors reads almost as if it were the history of modern political philosophy itself. Bacon, Descartes, Spinoza, Bayle, Hobbes, Locke, Rousseau, Hume, Smith, Montesquieu, Fichte, Hegel, Marx, and Nietzsche number among those whose ideas ring with the echo of Machiavelli’s thought. Even those who apparently rejected the foundations of his philosophy, such as Montaigne, typically regarded Machiavelli as a formidable opponent and deemed it necessary to engage with the implications of that philosophy.

Table of Contents

  1. Life
    1. The Youth (1469-1498)
    2. The Official (1498-1512)
    3. The Philosopher (1513-1527)
  2. Philosophical Themes
    1. Virtue
    2. Fortune
    3. Nature
    4. History and Necessity
    5. Truth
    6. Politics: The Humors
    7. Politics: Republicanism
    8. Glory
    9. Religion
    10. Ethics
  3. Machiavelli’s Corpus
    1. The Prince
    2. Discourses on Livy
    3. Art of War
    4. Florentine Histories
    5. Other Works
  4. Possible Philosophical Influences on Machiavelli
    1. Renaissance Humanism
    2. Renaissance Platonism
    3. Renaissance Aristotelianism
    4. Xenophon
    5. Lucretius
    6. Savonarola
    7. The Bible and Its Traditions
  5. Contemporary Interpretations
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

It is customary to divide Machiavelli’s life into three periods: his youth; his work for the Florentine republic; and his later years, during which he composed his most important philosophical writings.

Most of Machiavelli’s diplomatic and philosophical career was bookended by two important political events: the French invasion of Italy in 1494 by Charles VIII; and the sack of Rome in 1527 by the army of Emperor Charles V.

In what follows, citations to The Prince refer to chapter number (e.g., “P 17”). Citations to the Discourses and to the Florentine Histories refer to book and chapter number (e.g., “D 3.1” and “FH 4.26”). Citations to the Art of War refer to book and sentence number in the Italian edition of Marchand, Farchard, and Masi and in the corresponding translation of Lynch (e.g., “AW 1.64”).

a. The Youth (1469-1498)

Machiavelli was born on May 3, 1469, to a somewhat distinguished family. He grew up in the Santo Spirito district of Florence. He had three siblings: Primavera, Margherita, and Totto. His mother was Bartolomea di Stefano Nelli. His father was Bernardo, a doctor of law who spent a considerable part of his meager income on books and who seems to have been especially enamored of Cicero. So, at a young age, Machiavelli was exposed to many classical authors who influenced him profoundly; as he says in the Discourses, the things that shape a boy of “tender years” will ever afterward regulate his conduct (D 3.46). We do not know whether Machiavelli read Greek, but he certainly read Greek authors in translation, such as Thucydides, Plato, Xenophon, Aristotle, Polybius, Plutarch, and Ptolemy. He was studying Latin already by age seven and translating vernacular works into Latin by age twelve. Among the Latin authors that he read were Plautus, Terence, Caesar, Cicero, Sallust, Virgil, Lucretius, Tibullus, Ovid, Seneca, Tacitus, Priscian, Macrobius, and Livy. Among Machiavelli’s favorite Italian authors were Dante and Petrarch.

When he was twelve, Machiavelli began to study under the priest Paolo da Ronciglione, a famous teacher who instructed many prominent humanists. Machiavelli may have studied later under Marcello di Virgilio Adriani, a professor at the University of Florence.

The diaries of Machiavelli’s father end in 1487. For the next ten years, there is no record of Machiavelli’s activities. In 1497, he returns to the historical record by writing two letters in a dispute with the Pazzi family.

There were many important dates during this period. The Pazzi conspiracy against the Medici occurred in 1478. Savonarola began to preach in Florence in 1482. In 1492, Lorenzo the Magnificent died and Rodrigo Borgia ascended to the papacy as Alexander VI. In 1490, after preaching elsewhere for several years, Savonarola returned to Florence and was assigned to San Marco. In 1494, he gained authority in Florence when the Medici were expelled in the aftermath of the invasion of Charles VIII. Machiavelli’s mother passed away in 1496, the same year that Savonarola would urge the creation of the Great Council. On May 12, 1497, Savonarola was excommunicated by Alexander VI. On May 23, 1498, almost exactly a year later, he was hanged and then burned at the stake with two other friars in the Piazza della Signoria.

b. The Official (1498-1512)

Not long after Savonarola was put to death, Machiavelli was appointed to serve under Adriani as head of the Second Chancery. Machiavelli was 29 and had no prior political experience. A month after he was appointed to the Chancery, he was also appointed to serve as Secretary to the Ten, the committee on war.

In November 1498 he undertook his first diplomatic assignment, which involved a brief trip to the city of Piombino. In March 1499, he was sent to Pontedera to negotiate a pay dispute involving the mercenary captain, Jacopo d’Appiano. In July of the same year, he would visit Countess Caterina Sforza at Forli (P 3, 6, and 20; D 3.6; FH 7.22 and 8.34; AW 7.27 and 7.31).

His first major mission was to the French court, from July 1500 to January 1501. There he would meet Georges d’Amboise, the cardinal of Rouen and Louis XII’s finance minister (P 3). In 1501, he would take three trips to the city of Pistoia, which was being torn to pieces by factional disputes (P 17). Over the next decade, he would undertake many other missions, some of which kept him away from home for months (e.g., his 1507 mission to Germany).

In August 1501 he was married to Marietta di Ludovico Corsini. Machiavelli and Marietta would eventually have several children, including Bernardo, Primerana (who died young), an unnamed daughter (who also died young), Baccina, Ludovico, Piero, Guido, and Totto. Machiavelli was also romantically linked to other women, such as the courtesan La Riccia and the singer Barbera Salutati.

In 1502, Machiavelli met Cesare Borgia for the first time (e.g., P 3, 7, 8, and 17; D 2.24). In the same year, Florence underwent a major constitutional reform, which would place Piero Soderini as gonfaloniere for life (previously the term limit had been two months). Soderini (e.g., D 1.7, 1.52, 1.56, 3.3, 3.9, and 3.30) allowed Machiavelli to create a Florentine militia in 1505-1506. The militia was an idea that Machiavelli had promoted so that Florence would not have to rely upon foreign or mercenary troops (see P 12 and 13). In 1507, Machiavelli would be appointed to serve as chancellor to the newly created Nine, a committee concerning the militia.

Between 1502 and 1507, Machiavelli would collaborate with Leonardo da Vinci on various projects. The most notable was an attempt to connect the Arno River to the sea; to irrigate the Arno valley; and to cut off the water supply to Pisa.

In the summer of 1512, Machiavelli’s militia was crushed at the city of Prato. Soderini was exiled, and by September 1 Giuliano de’ Medici would march into Florence to reestablish Medici control of the city. Machiavelli’s tenure for the Florentine government would last from June 19, 1498 to November 7, 1512. He was one of the few officials from the republic to be dismissed upon the return of the Medici.

During this period, Cesare Borgia became the Duke of Valentinois in the late summer of 1498. Machiavelli’s father, Bernardo, died in 1500. Alexander VI died in August 1503 and was replaced by Pius III (who lasted less than a month). Julius II would ascend to the papacy later in November 1503.

c. The Philosopher (1513-1527)

In late 1512, Machiavelli was accused of participating in an anti-Medici conspiracy. In early 1513, he was imprisoned for twenty-two days and tortured with the strappado, a method that painfully dislocated the shoulders. He was released in March and retired to a family house (which still stands) in Sant’Andrea in Percussina.

It was a profound fall from grace, and Machiavelli felt it keenly; he complains of his “malignity of fortune” in the Dedicatory Letter to The Prince. He seems to have commenced writing almost immediately. By December 10, 1513, he wrote to his friend, Francesco Vettori, that he was hard at work on what we now know as his most famous philosophical book, The Prince. He also began to write the Discourses on Livy during this period.

During the following years, Machiavelli attended literary and philosophical discussions in the gardens of the Rucellai family, the Orti Oricellari. He wrote poetry and plays during this period, and in 1518 he likely wrote his most famous play, Mandragola.

Friends such as Francesco Guicciardini and patrons such as Lorenzo di Filippo Strozzi attempted, with varying degrees of success, to restore Machiavelli’s reputation with the Medici. Something must have worked. In 1520, Machiavelli was sent on a minor diplomatic mission to Lucca, where he would write the Life of Castruccio Castracani. Impressed, Giulio de’ Medici offered Machiavelli a position in the University of Florence as the city’s official historiographer. Giulio would also commission the Florentine Histories (which Machiavelli would finish by 1525).

In 1521, Machiavelli published the Art of War, the only major prose work he would publish during his lifetime. It was well received in both Florence and Rome. He directed the first production of Clizia in January 1525.

Machiavelli died on June 21, 1527. His body is buried in the Florentine basilica of Santa Croce.

During this period, Giovanni de’ Medici became Pope Leo X upon the death of Julius II, in 1513. He was the first Florentine ever to become pope. In October 1517, Martin Luther sent his 95 Theses to Albert of Mainz. In 1521, Luther was excommunicated by Leo X. In 1522, Piero Soderini died in Rome. In 1523, Giulio de’ Medici became Pope Clement VII. In 1527, Clement refused Henry VIII’s request for an annulment. Also in 1527, on May 6, Rome was sacked by the army of Emperor Charles V.

2. Philosophical Themes

If to be a philosopher means to inquire without any fear of boundaries, Machiavelli is the epitome of a philosopher. Although it is unclear exactly what “reason” means for Machiavelli, he says that it is “good to reason about everything” (bene ragionare d’ogni cosa; D 1.18). And he says: “I do not judge nor shall I ever judge it to be a defect to defend any opinion with reasons, without wishing to use either authority or force for it” (D 1.58). He claims that he will not reason about certain topics but then does so, anyway (e.g., P 2, 6, 11, and 12; compare D 1.16 and 1.58). And he suggests that a prince should be a “broad questioner” (largo domandatore) and a “patient listener to the truth” (paziente auditore del vero; P 23).

But what more precisely might Machiavelli mean by “philosophy”? It is worth noting that the word “philosophy” (filosofia) never appears in The Prince or the Discourses (but see FH 7.6). The word “philosopher(s)” (filosofo / filosofi) appears once in The Prince (P 19) and three times in the Discourses (D 1.56, 2.5, and 3.12; see also D 1.4-5 and 2.12, as well as FH 5.1 and 8.29). Machiavelli occasionally refers to other philosophical predecessors (e.g., D 3.6 and 3.26; FH 5.1; and AW 1.25).

For the sake of presentation, this article presumes that The Prince and the Discourses comprise a unified Machiavellian philosophy. Readers should note that other interpreters would not make this presumption. Regardless, what follows is a series of representative themes or vignettes that could support any number of interpretations.

a. Virtue

The most fundamental of all of Machiavelli’s ideas is virtù. This word has several valences but is reliably translated in English as “virtue” (sometimes as “skill” or “excellence”). Although difficult to characterize concisely, Machiavellian virtue concerns the capacity to shape things and is a combination of self-reliance, self-assertion, self-discipline, and self-knowledge.

With respect to self-reliance, a helpful way to think of virtue is in terms of what Machiavelli calls “one’s own arms” (arme proprie; P 1 and 13; D 1.21), a notion that he links to virtue. This phrase at times refers literally to one’s soldiers or troops. But it can also refer to a general sense of what is one’s own, that is, what does not belong to or depend upon something else. Minimally, then, virtue may mean to rely upon one’s self or one’s possessions. Maximally, it may mean to disavow reliance in every sense—such as the reliance upon nature, fortune, tradition, and so on. To be virtuous might mean, then, not only to be self-reliant but also to be independent. In this way, Machiavelli is perhaps the forerunner of various modern accounts of substance (e.g., that of Descartes) that characterize the reality of a thing in terms of its independence rather than its goodness.

With respect to self-assertion, those with virtue are dynamic and restless, even relentless. Machiavellian virtue thus seems more closely related to the Greek conception of active power (dynamis) than to the Greek conception of virtue (arete). Consequently, the idiom of idleness or leisure (ozio) is foreign to most, if not all, of the successful characters in Machiavelli’s writings, who instead constantly work toward the achievement of their aims. The Romans, ostensibly one of the model republics, always look for danger from afar; fight wars immediately if it is necessary; and do not hesitate to employ fraud (P 3; D 2.13). Cesare Borgia, ostensibly one of the model princes, labors ceaselessly to lay the proper foundations for his future (P 7). Machiavelli urges his readers to think of war always, especially in times of peace (P 14); never to fail to see the oncoming storm in the midst of calm (P 24); and to beware of Fortune, who is like “one of those raging rivers” that destroys everything in its path (P 25). He laments the idleness of modern times (D 1.pr; see also FH 5.1) and encourages potential founders to ponder the wisdom of choosing a site that would force its inhabitants to work hard in order to survive (D 1.1). Machiavelli says that a wise prince should never be idle in peaceful times but should instead use his industry (industria) to resist adversity when fortune changes (P 14).

With respect to self-discipline, virtue involves a recognition of one’s limits coupled with the discipline to work within those limits. The Prince, for instance, is occasionally seen as a manual for autocrats or tyrants. But in fact it is replete with recommendations of moderation and self-discipline. Machiavelli insists, for example, that a prince should use cruelty sparingly and appropriately (P 8); that he should not seek to oppress the people (P 9); that he should not spend his subjects’ money (P 16) or take their property or women (P 17); that he should appear to be merciful, faithful, honest, humane, and, above all, religious (P 18); that he should be reliable, not only as a “true friend” but as a “true enemy” (P 21); and so forth. And although Machiavelli rarely discusses justice in The Prince, he does say that “victories are never so clear that the winner does not have to have some respect [qualche respetto], especially for justice” (giustizia; P 21; see also 19 and 26). For Machiavelli, virtue includes a recognition of the restraints or limitations within which one must work: not only one’s own limits, but social ones, including conventional understandings of right and wrong.

Finally, with respect to self-knowledge, virtue involves knowing one’s capabilities and possessing the paradoxical ability to be firmly flexible. It is not enough to be constantly moving; additionally, one must always be ready and willing to move in another direction. Nor is it enough simply to recognize one’s limits; additionally, one must always be ready and willing to find ways to turn a disadvantage into an advantage. Success is never a permanent achievement. Time sweeps everything before it and brings the good as well as the bad (P 3); fortune varies and can ruin those who are obstinate (P 25). Virtue requires that we know how to be impetuous (impetuoso); that we know how to recognize fortune’s impetus (impeto); that we know how to move quickly in order to seize an opportunity before it evaporates. Virtue involves flexibility—but this is both a disciplined and an optimistic flexibility. Furthermore, it is a flexibility that exists within prudently ascertained parameters and for which we are responsible. What it means to be virtuous involves understanding ourselves and our place in the cosmos. In this way, Machiavelli’s conception of virtue is linked not only with his conception of fortune but also with necessity and nature. Furthermore, it raises the question of what it means to be wise (savio), an important term in Machiavelli’s thought.

It should be emphasized that Machiavellian virtue is not necessarily moral. At first glance and perhaps upon closer inspection, Machiavellian virtue is something like knowing when to choose virtue (as traditionally understood) and when to choose vice. As he puts it, we must learn how not to be good (P 15 and 19) or even how to enter into evil (P 18; compare D 1.52), since it is not possible to be altogether good (D 1.26). Machiavelli is sensitive to the role that moral judgment plays in political life; there would be no need to dissimulate if the opinions of others did not matter. But his point seems to be that we do not have to think of our own actions as being excellent or poor simply in terms of whether they are linked to conventional moral notions of right and wrong. Praise and blame are levied by observers, but not all observers see from the perspective of conventional morality.

Some scholars point to Machiavelli’s use of mitigating rhetorical techniques and to his reading of classical authors in order to argue that his notion of virtue is in fact much closer to the traditional account than it first appears. Crucial for this issue are the central chapters of The Prince (P 15-19). Some scholars highlight similarities between Machiavelli’s treatment of liberality and mercy in particular and the treatments of Cicero (De officiis) and Seneca (De beneficiis and De clementia). They argue that Machiavelli’s understanding of these virtues is not in principle different from the classical understanding and that Machiavelli’s concern is more with the manner in which these virtues are perceived or “held” (tenuto). Other scholars argue that these chapters of The Prince completely overturn the classical and Christian understanding of these virtues and that Machiavelli intends a new account that is actually “useful” in the world (utile; P 15). The scholarly disagreement over the status of the virtues in the central chapters of The Prince, in other words, reflects the broader disagreement concerning Machiavelli’s understanding of virtue as such.

Lastly, it is worth noting that virtù comes from the Latin virtus, which itself comes from vir or “man.” It is no accident that those without virtue are often called weak, pusillanimous, and even effeminate (effeminato)—such as the Medes, who are characterized as effeminate as the result of a long peace (P 6). Neither is it an accident that fortune, with which virtue is regularly paired and contrasted, is female (e.g., P 20 and 25).

b. Fortune

Fortuna stands alongside virtù as a core Machiavellian concept. It is reliably translated as “fortune” but it can also mean “storms at sea” in both Latin and Italian.

Machiavelli often situates virtue and fortune in tension, if not opposition. At times, he suggests that virtue can resist or even control fortune (e.g., P 25). But he also suggests that fortune cannot be opposed (e.g., D 2.30) and that it can hold down the greatest of men with its “malignity” (malignità; P Ded.Let and 7, as well as D 2.pr). Fortune accompanies good with evil and evil with good (FH 2.30). Thus, one of the most important questions to ask of Machiavelli concerns this relationship between virtue and fortune.

One way of engaging this question is to think of fortune in terms of what Machiavelli calls the “arms of others” (arme d’altri; P 1 and 12-13; D 1.43). This phrase at times refers literally to soldiers who are owned by someone else (auxiliaries) and soldiers who change masters for pay (mercenaries). But it can also refer to a general sense of what is not one’s own, that is, what belongs to or depends upon something else. Minimally, then, fortune means to rely upon outside influences, such as chance or God, rather than one’s self. Maximally, it may mean to rely completely upon outside influences and, in the end, to jettison completely the idea of personal responsibility. Few scholars would argue that Machiavelli upholds the maximal position, but it remains unclear how and to what extent Machiavelli believes that we should rely upon fortune in the minimal sense.

A second way of engaging this question is to examine the ways in which Machiavelli portrays fortune. In one passage, he likens fortune to “one of those violent rivers” (uno di questi fiumi rovinosi) which, when enraged, will flood plains and uproot everything in its path (P 25). This image uses language similar to the description of successful princes in the very same chapter (as well as elsewhere, such as P 19 and 20). Three times in the Prince 25 river image, fortune is said to have “impetus” (impeto); at least eight times throughout Prince 25, successful princes are said to need “impetuosity” (impeto) or to need to be impetuous (impetuoso). This linguistic proximity might mean various things: that virtue and fortune are not as opposed as they first appear; that a virtuous prince might share (or imitate) some of fortune’s qualities; or that a virtuous prince, in controlling fortune, takes over its role.

Even more famous than the likeness to a river is Machiavelli’s identification of fortune with femininity. This characterization has important Renaissance precedents—for instance, in the work of Leon Battista Alberti, Giovanni Pontano, and Enea Silvio Piccolomini. But Machiavelli’s own version is nuanced and has long resisted easy interpretation. In The Prince, fortune is identified as female (P 20) and is later said to be a woman or perhaps a lady (una donna; P 25). This image is echoed in one of Machiavelli’s poetic works, Dell’Occasione. There he is more specific: fortune is a woman who moves quickly with her foot on a wheel and who is largely bald-headed, except for a shock of hair that covers her face and prevents her from being recognized. Finally, in his tercets on fortune in I Capitoli, Machiavelli characterizes her as a two-faced goddess who is harsh, violent, cruel, and fickle.

It is worth looking more closely at The Prince’s image of una donna, which is the most famous of the feminine images. Machiavelli makes at least two provocative claims. Firstly, he says that it is necessary to beat and strike fortune down if one wants to hold her down. This hypothetical claim is often read as if it is a misogynistic imperative or at least a recommendation. But it is worth noting that Machiavelli does not claim that it is possible to hold fortune down at all; he instead simply remarks upon what would be necessary if one had the desire to do so. Secondly, Machiavelli says that fortune allows herself to be won more by the impetuous than by those who proceed in a cold or cautious manner. Thus, she is a friend of the young, “like a woman” (come donna; now a likeness rather than an identification). Here, too, it is worth noting that the emphasis concerns the agency of fortune. She is not conquered. Rather, she relents; she allows herself to be won. It is far from clear that the young men who come to her manage to subdue her in any meaningful way, with the implication being that it is not possible to do so without her consent.

On this point, it is also worth noting that recent work has increasingly explored Machiavelli’s portrayal of women. Although Machiavelli in at least one place discusses how a state is “ruined” because of women (D 3.26), he also seems to allow for the possibility of a female prince. The most notable ancient example is Dido, the founder and first queen of Carthage (P 20 and D 2.8). The most notable modern example is Caterina Sforza, who is called “Countess” six times (P 20; D 3.6; FH 8.34 [2x, but compare FH 7.22]; and AW 7.27 and 7.31) and “Madonna” twice (P 3 and D 3.6). Other possibilities include women who operate more indirectly, such as Epicharis and Marcia, the respective mistresses of Nero and Commodus (D 3.6). In other words, Machiavelli seems to allow for the possibility of women who act virtuously, that is, who adopt manly characteristics. It may be that a problem with certain male, would-be princes is that they do not know how to adopt feminine characteristics, such as the fickleness or impetuosity of Fortune (e.g., P 25).

A third way of engaging the question of fortune’s role in Machiavelli’s philosophy is to look at what fortune does. One of fortune’s most important roles is supplying opportunity (e.g., P 6 and 20, as well as D 1.10 and D 2.pr). Even the most excellent and virtuous men appear to require the opportunity to display themselves. Figures as great as Moses, Romulus, Cyrus, and Theseus are no exception (P 6), nor is the quasi-mythical redeemer whom Machiavelli summons in order to save Italy (P 26). They all require the situation to be amenable: for a people to be weak or dispersed; for a province to be disunited; and so forth. However, some scholars have sought to deflate the role of fortune here by pointing to the meager basis of many opportunities (e.g., that of Romulus) and by emphasizing Machiavelli’s suggestion that one can create one’s own opportunities (P 20 and 26).

It is worth noting that Machiavelli writes on ingratitude, fortune, ambition, and opportunity in I Capitoli; notably, he omits a treatment of virtue. This pregnant silence may suggest that Machiavelli eventually came to see fortune, and not virtue, as the preeminent force in human affairs. In The Prince, he says: “I judge that it might be true” (iudico potere essere vero) that fortune governs half our actions and leaves the other half, or “close to it,” for us to govern (P 25; compare FH 7.21 and 8.36). But surely here Machiavelli is encouraging, even imploring us to ask whether it might not be true.

c. Nature

What Machiavelli means by “nature” is unclear. At times, it seems related to instability, as when he says that the nature of peoples is variable (P 6); that it is possible to change one’s nature with the times (P 25; D 1.40, 1.41, 1.58, 2.3, and 3.39); that worldly things by nature are variable and always in motion (P 10 and FH 5.1; compare P 25); that human things are always in motion (D 1.6 and 2.pr); and that all things are of finite duration (D 3.1). Elsewhere, it seems related to stability, as when he says that human nature is the same over time (e.g., D 1.pr, 1.11, and 3.43). At least once Machiavelli speaks of “natural things” (cose della natura; P 7); at least twice he associates nature with God (via spokesmen; see FH 3.13 and 4.16). In the only chapter in either The Prince or the Discourses which has the word “nature” (natura; D 3.43) in the title, the word surprisingly seems to mean something like “custom” or “education.” And the “natural prince” (principe naturale; P 2) seems to be a hereditary prince rather than someone who has a princely nature.

The question of nature is particularly important for an understanding of Machiavelli’s political philosophy, as he says that all human actions imitate nature (D 2.3 and 3.9). The following remarks about human nature will thus be serviceable signposts. For if human actions imitate nature, then it is reasonable to believe that Machiavelli’s account of human nature would gesture toward his account of the cosmos.

One of the key features of Machiavelli’s understanding of human beings is that they are fundamentally acquisitive and appetitive. The root human desire is the “very natural and ordinary” desire to acquire (P 3), which, like all desires, can never be fully satisfied (D 1.37 and 2.pr; FH 4.14 and 7.14). Human beings enjoy novelty; they especially desire new things (D 3.21) or things that they do not have (D 1.5). It is worth noting that, while these formulations are in principle compatible with the acquisition of intellectual or spiritual things, most of Machiavelli’s examples suggest that human beings are typically preoccupied with material things. For example, he says that human beings forget a father’s death more easily than the loss of patrimony (P 17). In other words, they love property more than honor.

Human beings are generally susceptible to deception. They are generally ungrateful and fickle liars (P 17) who judge by what they see (P 18). They tend to believe in appearances (P 18) and also tend to be deceived by generalities (D 1.47, 3.10, and 3.34). It is easy to persuade them of something but difficult to keep them in that persuasion (P 6).

This susceptibility extends to self-deception. Human beings deceive themselves in pleasure (P 23). They are taken more by present things than by past ones (P 24), since they do not correctly judge either the present or the past (D 2.pr). They have little prudence (D 2.11) but great ambition (D 2.20). They always hope (D 2.30; FH 4.18) but do not place limits on their hope (D 2.28), such that they will willingly change lords in the mistaken belief that things will improve (P 3). They share a common defect of overlooking the storm during the calm (P 24), for they are “blind” in judging good and bad counsel (D 3.35). They often act like “lesser birds of prey,” driven by nature to pursue their prey while a larger predator fatally circles above them (D 1.40).

Machiavelli’s remarks upon human nature extend into the moral realm. He says that human beings are envious (D 1.pr) and often controllable through fear (P 17). Consequently, they hate things due to their envy and their fear (D 2.pr). They do not know how to be either altogether bad or altogether good (D 1.30); are more prone to evil than to good (D 1.9); and will always turn out to be bad unless made good by necessity (P 23). In something of a secularized echo of Augustinian original sin, Machiavelli even goes so far at times as to say that human beings are wicked (P 17 and 18) and that they furthermore corrupt others by wicked means (D 3.8). Unlike Augustine, however, he rarely (if ever) upbraids such behavior, and he furthermore does not seem to believe that any redemption of wickedness occurs in the next world.

For Machiavelli, human beings are generally imitative. In other words, they almost always walk on previously beaten paths (P 6). Especially in The Prince, imitation plays an important role. Machiavelli regularly encourages (or at least appears to encourage) his readers to imitate figures such as Cesare Borgia (P 7 and P 13) or Caesar (P 14), as well as certain models (e.g., D 3.33) and the virtue of the past in general (D 2.pr). However, it should be noted that recent work has called into question whether these recommendations are sincere. Machiavelli for instance decries the imitation of bad models in “these corrupt centuries of ours” (D 2.19); and some scholars believe that his recommendations regarding Cesare Borgia and Caesar in particular are attenuated and even completely subverted in the final analysis.

Finally, it is worth noting that some scholars believe that Machiavelli goes so far as to subvert the classical account of a hierarchy or chain of being—either by blurring the boundaries between traditional distinctions (such as principality / republics; good / evil; and even man / woman) or, more radically, by demolishing the account as such. On such a reading, Machiavelli might believe that substances are not determined by their natures or even that there are no natures (and thus no substances).

d. History and Necessity

History (istoria / storia) and necessity (necessità) are two important terms for Machiavelli that remain particularly obscure.

Machiavelli is among the handful of great philosophers who are also great historians. Although he was interested in the study of nature, his primary interest seemed to be the study of human affairs. He urges the study of history many times in his writings (e.g., P 14, as well as D 1.pr and 2.pr), especially with judicious attention (sensatamente; D 1.23; compare D 3.30). He implies that the Bible is a history (D 2.5) and praises Xenophon’s “life of Cyrus” as a history (P 14; D 2.13, 3.20, 3.22, and 3.39). The Discourses is presented as a philosophical commentary on Livy’s History. And Machiavelli wrote several historical works himself, including the verse Florentine history, I Decennali; the fictionalized biography of Castruccio Castracani; and the Medici-commissioned Florentine Histories. There is no question that he was keenly interested in the historian’s craft, especially the recovery of lost knowledge (e.g., D 1.pr and 2.5).

But what exactly does the historian study? What is history? It is not clear in Machiavelli’s writings whether he believes that time is linear or cyclical. Both accounts are compatible with his suggestions that human nature does not change (e.g., D 1.pr, 1.11, and 3.43) and that imitating the ancients is possible (e.g., D 1.pr). In some places in his writings, he gestures toward a progressive, even eschatological sense of time. His call for a legendary redeemer to unite Italy is a notable example (P 26). In other places, he gestures toward the cyclical account, such as his approximation of the Polybian cycle of regimes (D 1.2) or his suggestion that human events repeat themselves (FH 5.1; compare D 2.5). Scholars thus remain divided on this question. History for Machiavelli might be a process that has its own purposes and to which we must submit. Alternatively, it might be a process that we can master and turn toward our own ends.

In his major works, Machiavelli affords modern historians scant attention. He suggests in the first preface to the Discourses that the readers of his time lack a “true knowledge of histories” (D 1.pr). In the preface to the Florentine Histories, he calls Leonardo Bruni and Poggio Bracciolini “two very excellent historians” but goes on to point out their deficiencies (FH Pref). Machiavelli was friends with the historian Francesco Guicciardini, who commented upon the Discourses. Their philosophical engagement occurred primarily through correspondence, however, and in the major works Machiavelli does not substantively take up Guicciardini’s thought.

Machiavelli speaks more amply with respect to ancient historians. Recent work has pointed to provocative connections between Machiavelli’s thought and that of Greek historians, such as Herodotus (quoted at D 3.67), Thucydides (D 3.16 and AW 3.214), Polybius (D 3.40), Diodorus Siculus (D 2.5), Plutarch (D 1.21, 2.1, 2.24 [quoted], 3.12, 3.35, and 3.40), and Xenophon (P 14; D 2.2, 2.13, 3.20, 3.22 [2x], and 3.39 [2x]). Among the Latin historians that Machiavelli studied were Herodian (D 3.6), Justin (quoted at D 1.26 and 3.6), Procopius (quoted at D 2.8), Pliny (FH 2.2), Sallust (D 1.46, 2.8, and 3.6), Tacitus (D 1.29, 2.26, 3.6, and 3.19 [2x]; FH 2.2), and of course Livy.

In 1476, when Machiavelli was about seven years old, his father obtained a complete copy of Livy and prepared an index of towns and places for the printer Donnus Nicolaus Germanus. It is therefore fitting that one of Machiavelli’s two most widely known books is ostensibly a commentary on Livy’s History. Machiavelli mentions and quotes Livy many times in his major works. With only a few exceptions (AW 2.13 and 2.24), his treatment of Livy takes place in the Discourses. However, Machiavelli regularly alters or omits Livy’s words (e.g., D 1.12) and on occasion disagrees with Livy outright (e.g., D 1.58). There is even a suggestion that working with Livy’s account is akin to working with marble that has been badly blocked out (D 1.11). Only three chapters begin with epigraphic quotations from Livy’s text (D 2.3, 2.23, and 3.10), and in all three cases Livy’s words are modified in some manner. It remains an open question to what extent Machiavelli’s thought is a modification of Livy’s.

As with “history,” the word “necessity” has no univocal meaning in Machiavelli’s writings. Recent work has attempted to explore Machiavelli’s use of this term, with respect not only to his metaphysics but also to his thoughts on moral responsibility. Machiavelli frequently returns to the way that necessity binds, or at least frames, human action. Sometimes, Machiavelli seems to mean that an action is unavoidable, such as the “natural and ordinary necessity” (necessità naturale e ordinaria; P 3) of a new prince offending his newly obtained subjects. He suggests that there are certain rules of counsel that “never fail” (e.g., P 22). He speaks of the necessity that constrains writers (FH 7.6; compare D Ded. Let and D 1.10). And at least twice he mentions an “ultimate necessity” (ultima necessità; D 2.8 and FH 5.11). Sometimes, however, Machiavelli seems to mean that an action is a matter of prudence—meaning a matter of choosing the lesser evil (P 21)—such as using cruelty only “out of the necessity” (per la necessità; P 8) to secure one’s self and to maintain one’s acquisitions. And he suggests that there are rules which “never, or rarely, fail” (e.g., P 3)—that is, rules which admit the possibility of failure and which are thus not strictly necessary.

Machiavelli speaks of the necessities to be alone (D 1.9), to deceive (D 2.13), and to kill others (D 3.30). A Lucchese citizen in the Florentine Histories argues that “things done out of necessity neither should nor can merit praise or blame” (FH 5.11). And in one of the most famous passages concerning necessity, Machiavelli uses the word twice and, according to some scholars, with two different meanings: “Hence it is necessary [necessario] to a prince, if he wants to maintain himself, to learn to be able not to be good, and to use this and not use it according to necessity” (la necessità; P 15).

Necessity might be a condition to which we must submit ourselves. Alternatively, it might be a condition that we can alter, implying that we can alter the meaning of necessity itself. If what is necessary today might not be necessary tomorrow, then necessity becomes a weaker notion. At the very least, necessity would not be directly opposed to contingency; instead, as some scholars maintain, necessity itself would be contingent in some way and therefore shapeable by human agency.

The beginning of Prince 25 merits close attention on this point. There Machiavelli reports a view that he says is widely held in his day: the belief that our lives are fated or determined to such an extent that it does not matter what we choose to do. Though he admits that he has sometimes been inclined to this position, he ponders a different possibility “so that our free will not be eliminated” (perché il nostro libero arbitrio non sia spento). On this question, some scholars highlight Renaissance versions of the Stoic notion of fate, which contemporaries such as Pietro Pomponazzi seem to have held. Other scholars highlight Machiavelli’s concerns, especially in his correspondence, with astrological determinism (a version of which his friend, Vettori, seems to have held). Two years before he wrote his famous 13-21 September 1506 letter to Giovan Battista Soderini (the so-called Ghiribizzi al Soderini, or Musings to Soderini), Machiavelli wrote a now lost letter to Bartolomeo Vespucci, a Florentine teacher of astrology at the University of Padua. In his response to Machiavelli, Vespucci suggests that a wise man can affect the influence of the stars not by altering the stars (which is impossible) but by altering himself.

Still other scholars propose a connection with the so-called Master Argument (kurieuon logos) of the ancient Megarian philosopher, Diodorus Cronus. Diodorus denies the possibility of future contingencies, that is, the possibility that future events do not already have a determined truth value. Aristotle famously argues against this view in De Interpretatione; Cicero and Boethius also discuss the issue in their respective treatments of divine providence. Some scholars have suggested that the beginning of Prince 25 not only problematizes Machiavelli’s notion of necessity but also engages with this ancient controversy.

e. Truth

Machiavelli makes a remark concerning military matters that he says is “truer than any other truth” (D 1.21). However, he is most famous for his claim in chapter 15 of The Prince that he is offering the reader what he calls the “effectual truth” (verità effettuale), a phrase he uses there for the only time in all of his writings. Although the effectual truth may pertain to military matters (e.g., P 14 and P 17), it is comprehensive in that it treats all the things of the world and not just military things (P 18). Surprisingly, there is still relatively little work on this fundamental Machiavellian concept. What exactly is the effectual truth?

One way to address this question is to begin with Chapter 15 of The Prince, where Machiavelli introduces the term. Given his stated intention there to “write something useful for whoever understands it,” Machiavelli claims that it is more conveniente to go after the effectual truth than the imagination of things that have never been seen or known “to be in truth” (vero essere; compare FH 8.29). Conveniente is variously rendered by translators as “fitting,” “convenient,” “suitable,” “appropriate,” “proper,” and the like (compare Romulus’ opportunity in P 6). Two things seem to characterize the effectual truth in Chapter 15. Firstly, it is distinguished from what is imagined, particularly imagined republics and principalities (incidentally, this passage is the last explicit mention of a “republic” in the book). Though Machiavelli often appeals to the reader’s imagination with images (e.g., fortune as a woman), the effectual truth seems to appeal to the reader in some other manner or through some other faculty. Whatever it is, the effectual truth does not seem to begin with images of things. Secondly, the effectual truth is more fitting for Machiavelli’s intention of writing something useful for the comprehending reader. The implication seems to be that other (more utopian?) intentions might find the imagination of things a more appropriate rhetorical strategy.

Another way to address this question is to begin with the Dedicatory Letter to The Prince. Machiavelli suggests that those who want to “know well” the natures of princes and peoples are like those who “sketch” (disegnano) landscapes. These sketchers place themselves at high and low vantage points or perspectives in order to see as princes and peoples do, respectively. Scholars have highlighted at least two implications of Machiavelli’s use of this image: that observers see the world from different perspectives; and that it is difficult, if not impossible, to see oneself from one’s own perspective. Machiavelli’s politics, meaning the wider world of human affairs, is always the realm of the partial perspective because politics is always about what is seen. “Everyone sees how you appear,” he says, meaning that even grandmasters of duplicity, such as Pope Alexander VI and the Roman emperor Septimius Severus, must still reveal themselves in some sense to the public eye. The truth begins in ordinary apprehension (e.g., D 1.3, 1.8, 1.12, 2.2, 2.21, 2.27, and 3.34). No one can engage in politics without submitting themselves to what Machiavelli calls “this aspect of the world” (P 18), which is to say that no one can act in the world at all without displaying themselves in the very action (if not the result). But precisely because perspective is partial, it is subject to error and indeed manipulation (e.g., D 1.56, 2.pr, and 2.19).

Another way to put this point is to say that the “effect” (effetto) of the effectual truth is always the effect on some observer. Milan is not a wholly new principality as such but instead is new only to Francesco Sforza (P 1). Hannibal’s inhuman cruelty generates respect in the “sight” of his soldiers; by contrast, it generates condemnation in the sight of writers and historians (P 17). Unlike Machiavelli himself, those who damn the tumults of Rome do not see that these disorders actually lead to Roman liberty (D 1.4). It is worth noting that perspectives do not always differ. Sometimes multiple perspectives align, as when Severus is seen as “admirable” both by his soldiers and by the people (P 19; compare AW 1.257). Although the cause in each case differs—the people are “astonished” and “stupefied” (presumably through fear), whereas the soldiers are “reverent” and “satisfied” (presumably through love)—the same effect occurs. Or does it? Some scholars believe that differing causes cannot help but modify effects; in this case, admiration itself would be stained and colored by either love or fear and would be experienced differently as a result.

Machiavelli’s concern with appearance not only pertains to the interpretation of historical events but extends to practical advice, as well. Machiavelli says that a prince should desire to be held merciful and not cruel (though he immediately insists that a prince should take care not to “use this mercy badly”; P 17). And Machiavelli says that what makes a prince contemptible is to be held variable, light, effeminate, pusillanimous, or irresolute (P 19). What matters in politics is how we appear to others—how we are held (tenuto) by others. But how we appear depends upon what we do and where we place ourselves in order to do it. A wise prince for Machiavelli is not someone who is content to investigate causes—including superior causes (P 11), first causes (P 14 and D 1.4), hidden causes (D 1.3), and heavenly causes (D 2.5). Rather, it is someone who produces effects. And there are no effects considered abstractly. Some commentators believe that effects are only effects if they are seen or displayed. They thus see the effectual truth as proto-phenomenological. Others take a stronger line of interpretation and believe that effects are only effects if they produce actual changes in the world of human affairs. Touching rather than seeing might then be the better metaphor for the effectual truth (see P 18).

f. Politics: The Humors

Machiavelli is most famous as a political philosopher. Although he studied classical texts deeply, Machiavelli appears to depart somewhat from the tradition of political philosophy, a departure that in many ways captures the essence of his political position. At least at first glance, it appears that Machiavelli does not believe that the polity is caused by an imposition of form onto matter.

Given that Machiavelli talks of both form and matter (e.g., P 6 and D 1.18), this point deserves unpacking. Aristotle’s position is a useful contrast. For Aristotle, politics is similar to metaphysics in that form makes the city what it is. The difference between a monarchy and a republic is a difference in form. This is not simply a question of institutional arrangement; it is also a question of self-interpretation. Aristotelian political form is something like a lens through which the people understand themselves.

Firstly, it matters whether monarchs or republicans rule, as the citizens of such polities will almost certainly understand themselves differently in light of who rules them. A monarchical “soul” is different from a republican “soul.” Secondly, the factions of the city believe they deserve to rule on the basis of a (partial) claim of justice. Justice is thus the underlying basis of all claims to rule, meaning that, at least in principle, differing views can be brought into proximity to each other. Concord, or at least the potential for it, is both the basis and the aim of the city.

With respect to the first implication, Machiavelli occasionally refers to the six Aristotelian political forms (e.g., D 1.2). He even raises the possibility of a mixed regime (P 3; D 2.6 and 3.1; FH 5.8). But usually he speaks only of two forms, the principality and the republic (P 1). The lines between these two forms are heavily blurred; the Roman republic is a model for wise princes (P 3), and the people can be considered a prince (D 1.58). Machiavelli even at times refers to a prince of a republic (D 2.2). Finally, he says that virtuous princes can introduce any form that they like, with the implication being that form does not constitute the fundamental reality of the polity (P 6).

One explanation is that the reality that underlies all form is what Machiavelli nebulously calls “the state” (lo stato). On this account, political form for Machiavelli is not fundamentally causal; it is at best epiphenomenal and perhaps even nominal. Instead, Machiavelli assigns causality to the elements of the state called “humors” (umori) or “appetites” (appetiti). Some scholars focus on possible origins of this idea (e.g., medieval medicine or cosmology), whereas others focus on the fact that the humors are rooted in desire. Still others focus on the fact that the humors arise only in cities and thus do not seem to exist simply by nature.

Machiavelli says that the city or state is always minimally composed of the humors of the people and the great (P 9 and 19; D 1.4; FH 2.12 and 3.1, but contrast FH 8.19); in some polities, for reasons not entirely clear, the soldiers count as a humor (P 19). The polity is constituted, then, not by a top-down imposition of form but by a bottom-up clash of the humors. And as the humors clash, they generate various political effects (P 9)—these are sometimes good (e.g., “liberty”; D 1.4) and sometimes bad (e.g., “license”; P 17 and D 1.7, 1.37, 3.4 and 3.27; FH 4.1). It is worth noting that a third possibility is “principality,” which according to some scholars looks suspiciously like the imposition of form onto matter (e.g., P 6 and 26; see also FH Pref. and 3.1; compare the “wicked form” of D 3.8). Furthermore, Machiavelli does attribute certain qualities to those who live in republics—greater hatred, greater desire for revenge, and restlessness born from the memory of their previous liberty—which might be absent in those who live in principalities (P 4-5; D 1.16-19 and 2.2; FH 4.1). Such passages appear to bring him in closer proximity to the Aristotelian account than first glance might indicate.

The humors are also related to the second implication mentioned above. Machiavelli distinguishes the humors not by wealth or population size but rather by desire. These desires are inimical to each other in that they cannot be simultaneously satisfied: the great desire to oppress the people, and the people desire not to be oppressed (compare P 9, D 1.16, and FH 3.1). Discord, rather than concord, is thus the basis for the state. Consequently, Machiavelli says that a prince must choose to found himself on one or the other of these humors. Most interpreters have taken him to prefer the humor of the people for any number of reasons, not the least of which may be Machiavelli’s work for the Florentine republic. It is worth noting, though, that Machiavelli’s preference may be pragmatic rather than moral. Government means controlling one’s subjects (D 2.23), and “good government” might mean nothing more than a scorched-earth, Tacitean wasteland which one simply calls peace (P 7).

Although many aspects of Machiavelli’s account of the humors are well understood, some remain mysterious. Firstly, it is unclear what desire characterizes the humor of the soldiers, a third humor that occurs, if not always, at least in certain circumstances. Secondly, in the preface to the Florentine Histories Machiavelli suggests that Florence’s disintegration into multiple “divisions” (divisioni) is unique in the history of republics, but it is unclear how or why the typical humors of the people and the great subdivide further in Florence (though FH 2 and 3 may offer important clues). Thirdly, it is unclear whether a “faction” (fazione; e.g., D 1.54) and a “sect” (setta; e.g., D 2.5)—each of which plays an important role in Machiavelli’s politics—ultimately reduce to one of the fundamental humors or whether they are instead oriented around something other than desire. Finally, it should be noted that recent work has questioned whether the humors are as distinct as previously believed; whether an individual or group can move between them; and whether they exist on something like a spectrum or continuum. For example, it may be the case that a materially secure people would cease to worry about being oppressed (and might even begin to desire to oppress others in the manner of the great); or that an armed people would effectively act as soldiers (such that a prince would have to worry about their contempt rather than their hatred).

g. Politics: Republicanism

Some scholars claim that Machiavelli is the last ancient political philosopher because he understands the merciless exposure of political life. By contrast, others claim that Machiavelli is the first modern political philosopher because he understands the need to found one’s self on the people. Either position is compatible with a republican reading of Machiavelli. The status of Machiavelli’s republicanism has been the focus of much recent work.

Many scholars focus on Machiavelli’s teaching as it is set forth in the Discourses (though many of the same lessons are found in The Prince). As in The Prince, Machiavelli attributes qualities to republican peoples that might be absent in peoples accustomed to living under a prince (P 4-5; D 1.16-19 and 2.2; FH 4.1). He also distinguishes between the humors of the great and the people (D 1.4-5; P 9). However, in the Discourses he explores more carefully the possibility that the clash between them can be favorable (e.g., D 1.4). He associates both war and expansion with republics and with republican unity; conversely, he associates peace and idleness with republican disunity (D 2.25). He notes the flexibility of republics (D 3.9), especially when they are ordered well (D 1.2) and regularly drawn back to their beginnings (D 3.1; compare D 1.6). He ponders the political utility of public executions and—as recent work has emphasized—courts or public trials (D 3.1; compare the parlements of P 3 and P 19 and Cesare’s court of P 7). He even considers the possibility of a perpetual republic (compare D 3.17 with D 1.20, 1.34, 2.30, 3.1, and 3.22). Like many other authors in the republican tradition, he frequently ponders the problem of corruption (e.g., D 1.17, 1.18, 1.55, 2.pr, 2.19, 2.22, 3.1, 3.16, and 3.33).

However, it remains unclear exactly what Machiavelli means by terms such as “corruption,” “freedom,” “law,” and even “republic.” It is therefore not surprising that the content of his republicanism remains unclear, as well. In order to provide a point of entry into this problem, it would be helpful to offer a brief examination of three rival and contemporary positions concerning Machiavelli’s republicanism. Although what follows are stylized and compressed glosses of complicated interpretations, they may serve as profitable beginning points for a reader interested in pursuing the issue further.

One interpretation might be summed up by the Machiavellian phrase “good laws” (e.g., P 12). It holds that Machiavelli is something of a neo-Roman republican. What matters the most, politically speaking, are robust institutions and deliberative participation in public life (e.g., D 1.55). Freedom is the effect of good institutions. Corruption is a moral failing and more specifically a failing of reason. This interpretation focuses upon the stability of public life. A strength of this interpretation is the emphasis that it places upon the rule of law as well as Machiavelli’s understanding of virtue. A possible weakness of this view is that it seems to overlook Machiavelli’s insistence that freedom is a cause of good institutions, not an effect of them (e.g., D 1.4); and that it seems to conflate the Machiavellian humor of the people with a more generic and traditional understanding of “people,” that is, all those who are under the law.

A second interpretation might be summed up by the Machiavellian term “tumults” (e.g., D 1.4). It holds that Machiavelli is something of a radical or revolutionary democrat whose ideas, if comparable to anything classical, are more akin to Greek thought than to Roman. What matters the most, politically speaking, is non-domination. Freedom is a cause of good institutions; freedom is not obedience to any rule but rather the continuous practice of resistance to oppression that undergirds all rules. Corruption is associated with the desire to dominate others. This interpretation focuses upon the instability—and even the deliberate destabilization—of political life. A strength of this interpretation is the emphasis that it places upon tumults, motion, and the more “decent end” of the people (P 9; see also D 1.58). A possible weakness is that it seems to understand law in a denuded sense, that is, as merely a device to prevent the great from harming the people; and that it seems to overlook the chaos that might result from factional strife (e.g., P 17) or mob justice (e.g., FH 2.37 and 3.16-17).

A third interpretation, which is something of a middle position between the previous two, might be summed up by the Machiavellian phrase “wise prince” (e.g., P 3). It holds that Machiavelli advocates for something like a constitutional monarchy. What matters the most, politically speaking, is stability of public life and especially acquisitions, coupled with the recognition that such a life is always under assault from those who are dissatisfied. Freedom is both a cause and effect of good institutions. Corruption is associated with a decline (though not a moral decline) in previously civilized human beings. This interpretation focuses both on the stability and instability of political life (e.g., D 1.16). A strength of this interpretation is its emphasis upon understated features—such as courts, public trials, and even elections—in Machiavelli’s thought, and upon Machiavelli’s remarks concerning the infirmity of bodies which lack a “head” (e.g., P 26; D 1.44 and 1.57). A possible weakness is that it seems to downplay Machiavelli’s remarks on nature and consequently places outsized importance upon processes such as training (esercitato), education (educazione), and art (arte).

h. Glory

Glory is one of the key motivations for the various actors in Machiavelli’s corpus. Some scholars go so far as to claim that it is the highest good for Machiavelli. Others deflate its importance and believe that Machiavelli’s ultimate aim is to wean his readers from their desire for glory.

Machiavelli’s understanding of glory (gloria) is substantially beholden to that of the Romans, who were “great lovers of glory” (D 1.37; see also D 1.58 and 2.9). Ancient Romans attained prominence through the acquisition of dignitas, which can be translated as “dignity” but which also included the notion of honors or trophies awarded as recognition of one’s accomplishments. Possessions, titles, family achievements, and land could all contribute to dignitas. But what was most important was gloria, one’s glory and reputation (or lack thereof) for greatness. Plebeians, who did not possess as much wealth or family heritage as patricians, could still attain prominence in the Roman Republic by acquiring glory in speeches (e.g., Cicero) or through deeds, especially in wartime (e.g., Gaius Marius). Typically, this quest for glory occurred “within the system.” A Roman would begin his political career with a lower office (quaestor or aedile) and would attempt to rise to higher positions (tribune, praetor, or consul) by pitting his ambition and excellence in ferocious competition against his fellow citizens.

The destabilization of the Roman Republic was in part due to individuals who short-circuited this system, that is, who achieved glory outside the conventional political pathway. A notable example is Scipio Africanus. At the beginning of his ascendancy, Scipio had never held any political positions and was not even eligible for them. However, by his mid-twenties he had conducted major military reforms. By his mid-thirties, he had defeated no less a general than Hannibal, the most dangerous enemy the Romans ever faced and the “master [or teacher] of war” (maestro di guerra; D 3.10). This unprecedented achievement gained Scipio much glory—at least in the Senate, as Machiavelli notes (though not with Fabius Maximus; P 17 and D 3.19-21). Indeed, Scipio gained so much glory that he catapulted past his peers in terms of renown, regardless of his lack of political accomplishments. Consequently, his imitation was incentivized, which partly led to the rise of the warlords—such as Pompey and Julius Caesar—and the eventual end of the Republic.

Machiavelli’s understanding of glory is beholden to this Roman understanding in at least three ways: the dependence of glory upon public opinion; the possibility of an exceptional individual rising to prominence through nontraditional means; and the proximity of glory to military operations. One useful example of the concatenation of all three characteristics is Agathocles the Sicilian. Agathocles became king of Syracuse after rising from “a mean and abject fortune” (P 8). If one considers the “virtue of Agathocles,” Machiavelli says, one does not see why he should be judged inferior to “any most excellent captain.” Agathocles rose to supremacy with “virtue of body and spirit” and had no aid but that of the military. Indeed, there is little, if anything, that can be attributed to fortune in his ascent. It seems clear for all of these reasons that Agathocles is virtuous on the Machiavellian account. But Machiavelli goes on to say that “one cannot call it virtue” to do what Agathocles did. One cannot call it virtue to keep to a life of crime constantly; to slaughter the senators and the rich; to betray one’s friends; to be without faith, without mercy, without religion. Although such acts are compatible with Machiavellian virtue (and might even comprise it), they cannot be called virtuous according to the standards of conventional morality. Agathocles’ savage cruelty, inhumanity, and infinite crimes do not “permit him to be celebrated” among the most excellent human beings (compare P 6). In general, force and strength easily acquire reputation rather than the other way around (D 1.34). But Machiavelli concludes that Agathocles paid so little heed to public opinion that his virtue was not enough. In the end, Agathocles’ modes enabled him to acquire “empire but not glory” (P 8).

Glory for Machiavelli thus depends upon how you are seen and upon what people say about you. Many of the successful and presumably imitable figures in both The Prince and the Discourses share the quality of being cruel, for example. But even “cruelties well-used” (P 8) are insufficient to maintain your reputation in the long run. This is at least partly why explorations of deceit and dissimulation take on increasing prominence as both works progress (e.g., P 6, 19, and especially 26; D 3.6). One must learn to imitate not only the force of the lion but also the fraud of the fox (P 7, 18, and 19; D 2.13 and 3.40). Doing so might allow one to avoid a “double shame” and instead achieve a “double glory”: beginning a new regime and adorning it with good laws, arms, and examples (P 24).

Whether veneration (venerazione) and reverence (riverenzia) are ultimately higher concepts than glory remains an important question, and recent work has taken it up. Those interested in this question may find it helpful to begin with the following passages: P 6, 7, 11, 17, 19, 23, and 26; D 1.10-12, 1.36, 1.53-54, 2.20, 3.6 and 3.22; FH 1.9, 3.8, 3.10, 5.13, 7.5, and 7.34; and AW 6.163, 7.215, 7.216, and 7.223.

i. Religion

The place of religion in Machiavelli’s thought remains one of the most contentious questions in the scholarship. His brother Totto was a priest. His father appeared to be a devout believer and belonged to a flagellant confraternity called the Company of Piety. When Machiavelli was eleven, he joined the youth branch of this company, and he moved into the adult branch in 1493. From 1500 to 1513, Machiavelli and Totto paid money to the friars of Santa Croce in order to commemorate the death of their father and to fulfill a bequest from their great-uncle. Machiavelli’s actual beliefs, however, remain mysterious. He did write an Exhortation to Penitence (though scholars disagree as to his sincerity; compare P 26). And he did accept the last rites upon his deathbed in the company of his wife and some friends. But evidence in his correspondence—for instance, in letters from close friends such as Francesco Vettori and Francesco Guicciardini—suggests that Machiavelli did not take pains to appear publicly religious.

As with many other philosophers of the modern period, interpretations of Machiavelli’s religious beliefs can gravitate to the extremes: some scholars claim that Machiavelli was a pious Christian, while others claim that he was a militant and unapologetic atheist. Still others claim that he was religious but not in the Christian sense. It remains unclear what faith (fide) and piety (or mercy, pietà) mean for Machiavelli. Perhaps the easiest point of entry is to examine how Machiavelli uses the word “religion” (religione) in his writings.

Machiavelli variously speaks of “the present religion” (la presente religione; e.g., D 1.pr), “this religion” (questa religione; e.g., D 1.55), “the Christian religion” (la cristiana religione; e.g., FH 1.5), and “our religion” (nostra religione; e.g., D 2.2). Machiavelli says that “our religion [has shown] the truth and the true way” (D 2.22; cf. D 3.1 and 1.12), though he is careful not to say that it is the true way. “Our religion” is also contrasted to the curiously singular “ancient religion” (religione antica; D 2.2). Recent work has suggested that Machiavelli’s notion of the ancient religion may be analogous to, or even associated with, the prisca theologia / philosophia perennis which was investigated by Ficino, Pico, and others.

Machiavelli speaks of religious “sects” (sette; e.g., D 2.5), a type of group that seems to have a lifespan between 1,666 and 3,000 years. Species of sects tend to be distinguished by their adversarial character, such as Catholic versus heretical (FH 1.5); Christian versus Gentile (D 2.2); and Guelf versus Ghibelline (P 20). They also generally, if not exclusively, seem to concern matters of theological controversy. It is not clear whether and to what extent a religion differs from a sect for Machiavelli.

Machiavelli suggests that reliance upon certain interpretations—“false interpretations” (false interpretazioni)—of the Christian God has led in large part to Italy’s servitude. Such interpretations implore human beings to think more of enduring their beatings than of avenging them (D 2.2 and 3.27). He seems to allow for the possibility that not all interpretations are false; for example, he says that Francis and Dominic rescue Christianity from elimination, presumably because they return it to an interpretation that focuses upon poverty and the life of Christ (D 3.1). And one of the things that Machiavelli may have admired in Savonarola is that he knew how to interpret Christianity in a way that is muscular and manly rather than weak and effeminate (compare P 6 and 12; D 1.pr, 2.2 and 3.27; FH 1.5 and 1.9; and AW 2.305-7).

Some scholars have emphasized the various places where Machiavelli associates Christianity with the use of dissimulation (e.g., P 18) and fear (e.g., D 3.1) as a form of social control. Other scholars believe that Machiavelli adheres to an Averroeist (which is to say Farabian) understanding of the public utility of religion. On such an understanding, religion is necessary and salutary for public morality. The philosopher should therefore take care not to disclose his own lack of belief or at least should attack only impoverished interpretations of religion rather than religion as such.

Finally, recent work has emphasized the extent to which Machiavelli’s concerns appear eminently terrestrial; he never refers in either The Prince or the Discourses to the next world or to another world.

j. Ethics

Machiavelli’s very name has become a byword for treachery and relentless self-interest. His ethical viewpoint is usually described as something like “the end justifies the means” (see for instance D 1.9). Is this a fair characterization?

The easiest point of entry into Machiavelli’s notion of ethics is the concept of cruelty. At least since Montaigne (and more recently with philosophers such as Judith Shklar and Richard Rorty), this vice has held a special philosophical status. Indeed, contemporary moral issues such as animal ethics, bullying, shaming, and so forth are such contentious issues largely because liberal societies have come to condemn cruelty so severely. It is all the more striking to readers today, then, when they confront Machiavelli’s seeming recommendations of cruelty. Such recommendations are common throughout his works. In the Discourses, Machiavelli appears to recommend a cruel way which is an enemy to every “Christian,” and indeed “human,” way of life (D 1.26); furthermore, he appears to indirectly attribute this way of life to God (via David). In The Prince, he speaks of “cruelties well-used” (P 8) and explicitly identifies almost every imitable character as cruel (e.g., P 7, 8, 19, and 21). He even speaks of “mercy badly used” (P 17).

The fact that seeming vices can be used well and that seeming virtues can be used poorly suggests that there is an instrumentality to Machiavellian ethics that goes beyond the traditional account of the virtues. One could find many places in his writings that support this point (e.g., D 1.pr and 2.6), although the most notable is when he says that he offers something “useful” to whoever understands it (P 15). But what exactly is this instrumentality?

Partly, it seems to come from human nature. We have a “natural and ordinary desire” to acquire (P 3) which can never in principle be satisfied (D 1.37 and 2.pr; FH 4.14 and 7.14). Human life is thus restless motion (D 1.6 and 2.pr), resulting in clashes in the struggle to satisfy one’s desires. It is thus useful as a regulative ideal, and is perhaps even true, that we should see others as bad (D 1.3 and 1.9) and even wicked beings (P 17 and 18) who corrupt others by wicked means (D 3.8). In order to survive in such a world, goodness is not enough (D 3.30). Instead, we must learn how not to be good (P 15 and 19) or even how to enter into evil (P 18; compare D 1.52), since it is not possible to be altogether good (D 1.26). Even “the good” itself is variable (P 25). Thus, virtues and vices serve something outside themselves; they are not purely good or bad. Recognizing this limitation of both virtue and vice is eminently useful.

Another way to put this point is in terms of imitation. While we should often imitate those greater than us (P 6), we should also learn how to imitate those lesser than us. For example, we should imitate animals in order to fight as they do, since human modes of combat, such as law, are often not enough—especially when dealing with those who do not respect laws (P 18). More specifically, we should imitate the lion and the fox. The lion symbolizes force, perhaps to the point of cruelty; the fox symbolizes fraud, perhaps to the point of lying about the deepest things, such as religion (P 18). Everything, even one’s faith (D 1.15) and one’s offspring (P 11), can be used instrumentally.

The mention of the fox brings us to a second profitable point of entry into Machiavellian ethics, namely deception. Machiavelli’s moral exemplars are often cruel, but they are also often dissimulators. One of the clearest examples is Pope Alexander VI, a particularly adroit liar (P 18). Throughout his writings, Machiavelli regularly advocates lying (e.g., D 1.59 and 3.42; FH 6.17), especially for those who attempt to rise from humble beginnings (e.g., D 2.13). He even at one point suggests that it is useful to simulate craziness (D 3.2).

Because cruelty and deception play such important roles in his ethics, it is not unusual for related issues—such as murder and betrayal—to rear their heads with regularity. If Machiavelli possessed a sense of moral squeamishness, it is not something that one easily detects in his works. However, it should be noted that recent work has suggested that many, if not all, of Machiavelli’s shocking moral claims are ironic. If this hypothesis is true, then his moral position would be much more complicated than it appears to be. Does Machiavelli ultimately ask us to rise above considerations of utility? Does he, of all people, ask us to rise above what we have come to see as Machiavellianism?

3. Machiavelli’s Corpus

In what follows, Machiavelli’s four major works are discussed and then his other writings are briefly characterized.

a. The Prince

The Prince is Machiavelli’s most famous philosophical book. It was begun in 1513 and probably completed by 1515. We possess no surviving manuscript copy of it in Machiavelli’s own handwriting. We first hear of it in Machiavelli’s 10 December 1513 letter to his friend, Francesco Vettori, wherein Machiavelli divulges that he has been composing “a little work” entitled De Principatibus. Machiavelli also says that Filippo Casavecchia, a longtime friend, has already seen a rough draft of the text.

Evidence suggests that other manuscript copies were circulating among Machiavelli’s friends, and perhaps beyond, by 1516-17. These manuscripts, some of which we do possess, do not bear the title of The Prince. That title did not appear until roughly five years after Machiavelli’s death, when the first edition of the book was published with papal privilege in 1532.

Which title did Machiavelli intend: the Latin title of De Principatibus (“Of Principalities”); or the Italian title of Il Principe (“The Prince”)? That the book has two purported titles—and that they do not translate exactly into one another—remains an enduring and intriguing puzzle. The structure of The Prince does not settle the issue, as the book begins with chapters that explicitly treat principalities, but eventually proceeds to chapters that explicitly treat princes. Nor does the content settle the issue; the chapter titles are in Latin but the body of each chapter is in Italian, and the words “prince” and “principality” occur frequently throughout the entire book. Lastly, the Discourses offer no easy resolution; Machiavelli there refers to The Prince both as “our treatise of principalities” (nostro trattato de’ principati; D 2.1) and “our treatise of the Prince” (nostro trattato de Principe; D 3.42).

The Prince is composed of twenty-six chapters which are preceded by a Dedicatory Letter to Lorenzo de’ Medici (1492-1519), the grandson of Lorenzo the Magnificent (1449-92). As we learn from the aforementioned letter to Vettori, Machiavelli had originally intended to dedicate The Prince to Lorenzo the Magnificent’s son, Giuliano. At some point, for reasons not entirely clear, Machiavelli changed his mind and dedicated the volume to Lorenzo. We do not know whether Giuliano or Lorenzo ever read the work. There is an old story, perhaps apocryphal, that Lorenzo preferred a pack of hunting dogs to the gift of The Prince and that Machiavelli consequently swore revenge against the Medici. At any rate, the question of the precise audience of The Prince remains a key one. Some interpreters have even suggested that Machiavelli writes to more than one audience simultaneously.

The question of authorial voice is also important. Machiavelli himself appears as a character in The Prince twice (P 3 and 7) and sometimes speaks in the first person (e.g., P 2 and P 13). However, it is not obvious how to interpret these instances, with some recent scholars going so far as to say that Machiavelli operates with the least sincerity precisely when speaking in his own voice. This issue is exacerbated by the Dedicatory Letter, in which Machiavelli sets forth perhaps the foundational image of the book. He compares “those who sketch [disegnano]” landscapes from high and low vantage points to princes and peoples, respectively. And he suggests that “to know well” the nature of peoples one needs to be a prince, and vice versa. The suggestion seems to be that Machiavelli throughout the text variously speaks to one or the other of these vantage points and perhaps even variously speaks from one or the other of these vantage points. At the very least, the image implies that we should be wary of taking his claims in a straightforward manner. The sketcher image becomes even more complicated later in the text, when Machiavelli introduces the perspectives of two additional “humors” of the city, that is, the great (i grandi; P 9) and the soldiers (i soldati; P 19).

An additional interpretative difficulty concerns the book’s structure. In the first chapter, Machiavelli appears to give an outline of the subject matter of The Prince. But this subject matter appears to be exhausted as early as Chapter 7. What, then, to make of the rest of the book? One possibility is that The Prince is not a polished work; some scholars have suggested that it was composed in haste and that consequently it might not be completely coherent. An alternative hypothesis is that Machiavelli has some literary or philosophical reason to break from the structure of the outline, keeping with his general trajectory of departing from what is customary. A third hypothesis is that the rest of the book is somehow captured by the initial outline and that what Machiavelli calls “threads” (orditi; P 2) or “orders” (ordini; P 10) flow outward, if only implicitly, from the first chapter.

Whatever interpretation one holds to, the subject matter of the book seems to be arranged into roughly four parts: Chapters 1-11 treat principalities (with the possible exception of Chapter 5); Chapters 12-14 treat the art of war; Chapters 15-19 treat princes; and Chapters 20-26 treat what we may call the art of princes. The first three sections, at least, are suggested by Machiavelli’s own comments in the text. In Chapter 12, Machiavelli says that he has previously treated the acquisition and maintenance of principalities and says that the remaining task is to discourse generally on offensive and defensive matters. Similarly, in Chapter 15, Machiavelli says that what remains is to see how a prince should act with respect to subjects and friends, implying minimally that what has come previously is a treatment of enemies.

Almost from its composition, The Prince has been notorious for its seeming recommendations of cruelty; its seeming prioritization of autocracy (or at least centralized power) over more republican or democratic forms; its seeming lionization of figures such as Cesare Borgia and Septimius Severus; its seeming endorsements of deception and faith-breaking; and so forth. Indeed, it remains perhaps the most notorious work in the history of political philosophy. One should be wary, however, of resting with what seems to be the case in The Prince, especially given Machiavelli’s repeated insistence that appearances can be manipulated. But the meaning of these manipulations, and indeed of these appearances, remains a scholarly question. Interpreters of the caliber of Rousseau and Spinoza have believed The Prince to bear a republican teaching at its core. Some scholars have gone so far as to see it as an utterly satirical or ironic work. Others have insisted that the book is even more dangerous than it first appears. At any rate, how The Prince fits together with the Discourses (if at all) remains one of the enduring puzzles of Machiavelli’s legacy.

b. Discourses on Livy

There is reason to suspect that Machiavelli had begun writing the Discourses as early as 1513; for instance, there seems to be a reference in The Prince to another, lengthier work on republics (P 2). And since the Discourses references events from as late as 1517, it seems to have still been a work in progress by that point and perhaps even later.

Evidence suggests that manuscript copies were circulating by 1530 and perhaps earlier. We do not possess any of these manuscripts; in fact, we possess no manuscript of the Discourses in Machiavelli’s handwriting except for what is now known as the preface to the first book. It bears no heading and begins with a paragraph that our other manuscripts do not have. There is still debate over whether this paragraph should be excised (since it is not found in the other manuscripts) or whether it should be retained (since it is found in the only polished writing we have of the Discourses in Machiavelli’s hand). It is typically retained in English translations.

Roughly four years after Machiavelli’s death, the first edition of the Discourses was published with papal privilege in 1531. As with The Prince, there is a bit of mystery surrounding the title of the Discourses. The book appeared first in Rome and then a few weeks later in Florence, with the two publishers (Blado and Giunta, respectively) seemingly working with independent manuscripts. Both the Blado and Giunta texts give the title of Discorsi sopra la prima deca di Tito Livio. The reference is to Livy’s History of Rome (Ab Urbe Condita) and more specifically to its first ten books. Machiavelli refers simply to Discorsi in the Dedicatory Letter to the work, however, and it is not clear whether he intended the title to specifically pick out the first ten books by name. Additionally, some of Machiavelli’s contemporaries, such as Guicciardini, do not name the book by the full printed title. Today, the title is usually given as the Discourses on Livy (or the Discourses for short).

The number of chapters in the Discourses is 142, which is the same as the number of books in Livy’s History. This is a curious coincidence and one that is presumably intentional. But what is the intent? Scholars are divided on this issue. A second, related curiosity is that the manuscript as we now have it divides the chapters into three parts or books. However, the third part does not have a preface as the first two do.

As with the dedicatory letter to The Prince, there is also a bit of mystery surrounding the dedicatory letter to the Discourses. The work is dedicated to Zanobi Buondelmonti and Cosimo Rucellai, two of Machiavelli’s friends, of whom Machiavelli says in the letter that they deserve to be princes even though they are not. It is noteworthy that the Discourses is the only one of the major prose works dedicated to friends; by contrast, The Prince, the Art of War, and the Florentine Histories are all dedicated to potential or actual patrons.

Machiavelli makes his presence known from the very beginning of the Discourses; the first word of the work is the first person pronoun, “Io.” And indeed the impression that one gets from the book overall is that Machiavelli takes fewer pains to recede into the background here than in The Prince. The Discourses is, by Machiavelli’s admission, ostensibly a commentary on Livy’s history. In the preface to the first book, Machiavelli laments the fact that there is no longer a “true knowledge of histories” (vera cognizione delle storie) and judges it necessary to write upon the books of Livy that have not been intercepted by “the malignity of the times” (la malignità de’ tempi). He claims that those who read his writings can “more easily draw from them that utility [utilità] for which one should seek knowledge of histories” (D 1.pr). However, it is a strange kind of commentary: one in which Machiavelli regularly alters or omits Livy’s words (e.g., D 1.12) and in which he disagrees with Livy outright (e.g., D 1.58).

Clues as to the structure of the Discourses may be gleaned from Machiavelli’s remarks in the text. At the end of the first chapter (D 1.1), Machiavelli distinguishes between things done inside and outside the city of Rome. He further distinguishes between things done by private and public counsel. Finally, he claims that the first part or book will treat things done inside the city by public counsel. The first part, then, primarily treats domestic political affairs. Machiavelli says that the second book concerns how Rome became an empire, that is, it concerns foreign political affairs (D 2.pr). If Machiavelli did in fact intend there to be a third part, the suggestion seems to be that it concerns affairs conducted by private counsel in some manner. It is noteworthy that fraud and conspiracy (D 2.13, 2.41, and 3.6), among other things, become increasingly important topics as the book progresses.

At first glance, it is not clear whether the teaching of the Discourses complements that of The Prince or whether it militates against it. Scholars remain divided on this issue. Some insist upon the coherence of the books, either in terms of a more nefarious teaching typically associated with The Prince; or in terms of a more consent-based, republican teaching typically associated with the Discourses. Others see the Discourses as a later, more mature work and take its teaching to be truer to Machiavelli’s ultimate position, especially given his own work for the Florentine republic. At any rate, how the books fit together remains perhaps the preeminent puzzle concerning Machiavelli’s philosophy. The Discourses nevertheless remains one of the most important works in modern republican theory. It had an enormous effect on republican thinkers such as Rousseau, Montesquieu, Hume, and the American Founders. (See “Politics: Republicanism” above.)

c. Art of War

The Art of War is the only significant prose work published by Machiavelli during his lifetime and his only attempt at writing a dialogue in the humanist tradition. It was probably written in 1519. The first edition was published in 1521 in Florence under the title Libro della arte della Guerra di Niccolò Machiavegli cittadino et segretario fiorentino. It takes the literary form of a dialogue divided into seven books and preceded by a preface. Like The Prince, the work is dedicated to a Lorenzo—in this case, Lorenzo di Filippo Strozzi, “Florentine Patrician.” Strozzi was either a friend (as has been customarily held) or a patron (as recent work suggests). It is worth noting in passing that we possess autograph copies of two of Strozzi’s works in Machiavelli’s hand (Commedia and Pistola).

The action of the Art of War takes place after dinner and in the deepest and most secret shade (AW 1.13) of the Orti Oricellari, the gardens of the Rucellai family. These gardens were cultivated by Bernardo Rucellai, a wealthy Florentine who was a disciple of Ficino and who was also the uncle of two Medici popes, Leo X and Clement VII (via his marriage to Nannina, the eldest sister of Lorenzo the Magnificent). Bernardo filled the gardens with plants mentioned in classical texts (AW 1.13-15) and intended the place to be a center of humanist discussion. Ancient philosophy, literature, and history were regularly discussed there, in addition to contemporary works on occasion (for example, some of Machiavelli’s Discourses on Livy). Visitors included Machiavelli, Guicciardini, and members of Ficino’s so-called Platonic Academy. Notably, the gardens were the site of at least two conspiracies: an aristocratic one while Florence was a republic under the rule of Soderini (1498-1512); and a republican one, headed up by Cosimo Rucellai, after the Medici regained control in 1512. Conspiracy is one of the most extensively examined themes in Machiavelli’s corpus: it is the subject of both the longest chapter of The Prince (P 19) and the longest chapter of the Discourses (D 3.6; see also FH 2.32, 7.33, and 8.1).

One of the interlocutors of the Art of War is Bernardo’s grandson, Cosimo Rucellai, who is also one of the dedicatees of the Discourses. The other dedicatee of the Discourses, Zanobi Buondelmonti, is also one of the interlocutors of the Art of War. Two of the other young men present are Luigi Alamanni (to whom Machiavelli dedicated the Life of Castruccio Castracani along with Zanobi) and Battista della Palla. But perhaps the most important and striking speaker is Fabrizio Colonna. Colonna was a mercenary captain, which is notable enough given Machiavelli’s insistent warnings against mercenary arms (e.g., P 12-13 and D 1.43). However, Colonna was also the leader of the Spanish forces that compelled the capitulation of Soderini and that enabled the Medici to regain control of Florence.

In the preface to the work, Machiavelli notes the vital importance of the military: he compares it to a palace’s roof, which protects the contents (compare FH 6.34). And he laments the corruption of modern military orders as well as the modern separation of military and civilian life (AW Pref., 3-4). Roughly speaking, books 1 and 2 concern issues regarding the treatment of soldiers, such as payment and discipline. Books 3 and 4 concern issues regarding battle, such as tactics and formation. Book 5 concerns issues regarding logistics, such as supply lines and the use of intelligence. Book 6 concerns issues regarding the camp, including a comparison to the way that the Romans organized their camps. Book 7 concerns issues regarding armament, such as fortifications and artillery. Like The Prince, the Art of War ends with an indictment of Italian princes with respect to Italy’s weak and fragmented situation.

Many Machiavellian themes from The Prince and the Discourses recur in the Art of War. Some examples are: the importance of one’s own arms (AW 1.180; P 6-9 and 12-14; D 2.20); modern misinterpretations of the past (AW 1.17; D 1.pr and 2.pr); the way that good soldiers arise from training rather than from nature (AW 1.125 and 2.167; D 1.21 and 3.30-9); the need to divide an army into three sections (AW 3.12ff; D 2.16); the willingness to adapt to enemy orders (AW 4.9ff; P 14; D 3.39); the importance of inspiring one’s troops (AW 4.115-40; D 3.33); the importance of generating obstinacy and resilience in one’s troops (AW 4.134-48 and 5.83; D 1.15); and the relationship between good arms and good laws (AW 1.98 and 7.225; P 12).

Strong statements throughout his corpus hint at the immensely important role of war in Machiavelli’s philosophy. In The Prince, Machiavelli says that a prince should focus all of his attention upon becoming a “professional” in the art of war (professo; compare the “professions” of AW Pref. and P 15), for “that is the only art which is of concern to one who commands” (P 14). In the Discourses, he says that it is “truer than any other truth” that it is always a prince’s defect (rather than a defect of a site or nature) when human beings cannot be made into soldiers (D 1.21). And his only discussion of science in The Prince or the Discourses comes in the context of hunting as an image of war (D 3.39). Such statements, along with Machiavelli’s dream of a Florentine militia, point to the key role of the Art of War in Machiavelli’s corpus. But the technical nature of its content, if nothing else, has proved to be a resilient obstacle for scholars who attempt to master it, and the book remains the least studied of his major works.

d. Florentine Histories

This is the last of Machiavelli’s major works. It was not his first attempt at penning a history; Machiavelli had already written a two-part verse history of Italy, I Decennali, which covers the years 1492-1509. But the Florentine Histories is a greater effort. It is written in prose and covers the period of time from the decline of the Roman Empire until the death of Lorenzo the Magnificent in 1492.

The Florentine Histories was commissioned in 1520 by Pope Leo X, on behalf of the Officers of Study of Florence. The intervention of Cardinal Giulio de’ Medici was key; the Histories would be dedicated to him and presented to him in 1525, by which time he had ascended to the papacy as Clement VII. Machiavelli presented eight books to Clement and did not write any additional ones. They were not published until 1532.

Although Giulio had made Machiavelli the official historiographer of Florence, it is far from clear that the Florentine Histories is a straightforward historiographical account. Machiavelli says in the Dedicatory Letter that he is writing of “those times which, through the death of the Magnificent Lorenzo de’ Medici, brought a change of form [forma] in Italy.” He says that he has striven to “satisfy everyone” while “not staining the truth.” In the Preface, Machiavelli says that his intent is to write down “the things done inside and outside [the city] by the Florentine people” (le cose fatte dentro e fuora dal popolo fiorentino) and that he changed his original intention in order that “this history may be better understood in all times.”

Though Book 1 is ostensibly a narrative concerning the time from the decline of the Roman Empire, in Book 2 he calls Book 1 “our universal treatise” (FH 2.2), thus implying that it is more than a simple narrative. Books 2, 3, and 4 concern the history of Florence itself from its origins to 1434. Books 5, 6, 7, and 8 concern Florence’s history against the background of Italian history.

In Book 1, Machiavelli explores how Italy has become disunited, in no small part due to causes such as Christianity (FH 1.5) and barbarian invasions (FH 1.9). The rise of Charlemagne is also a crucial factor (FH 1.11). Machiavelli notes that Christian towns have been left to the protection of lesser princes (FH 1.39) and even no prince at all in many cases (FH 1.30), such that they “wither at the first wind” (FH 1.23).

In Book 2, Machiavelli famously calls Florence “[t]ruly a great and wretched city” (Grande veramente e misera città; FH 2.25). Scholars have long focused upon how Machiavelli thought Florence was wretched, especially when compared to ancient Rome. But recent work has begun to examine the ways in which Machiavelli thought that Florence was great, as well as the overlap between the Histories and the Discourse on Florentine Affairs (which was also commissioned by the Medici around 1520). Book 2 also examines the ways in which the nobility disintegrates into battles between families (e.g., FH 2.9) and into various splinter factions of Guelfs (supporters of the Pope) and Ghibellines (supporters of the Emperor). The rise of Castruccio Castracani, alluded to in Book 1 (e.g., FH 1.26), is further explored (FH 2.26-31), as well as various political reforms (FH 2.28 and 2.39).

Books 3 and 4 are especially notable for Machiavelli’s analysis of the class conflicts that exist in every polity (e.g., FH 3.1), and some scholars believe that his treatment here is more developed and nuanced than his accounts in either The Prince or the Discourses. Machiavelli also narrates the rise of several prominent statesmen: Salvestro de’ Medici (FH 3.9); Michele di Lando (FH 3.16-22; compare FH 3.13); Niccolò da Uzzano (FH 4.2-3); and Giovanni di Bicci de’ Medici (FH 4.3 and 4.10-16), whose family is in the ascendancy at the end of Book 4.

Books 5 and 6 ostensibly concern the rise of the Medici, and indeed one might view Cosimo’s ascent as something of the central event of the Histories (see for instance FH 5.4 and 5.14). Yet in fact Machiavelli devotes the majority of Books 5 and 6 not to the Medici but rather to the rise of mercenary armies in Italy (compare P 12 and D 2.20). Among the topics that Machiavelli discusses are the famous battle of Anghiari (FH 5.33-34); the fearlessness with which mercenary captains break their word (FH 6.17); the exploits of Francesco Sforza (e.g., FH 6.2-18; compare P 1, 7, 12, 14, and 20 as well as D 2.24); and the propensity of mercenaries to generate wars so that they can profit (FH 6.33; see also AW 1.51-62).

Books 7 and 8 principally concern the rise of the Medici, in particular Cosimo; his son, Piero the Gouty; and his son in turn, Lorenzo the Magnificent. Cosimo (though “unarmed”) dies with great glory and is famous largely for his liberality (FH 7.5) and his attention to city politics: he prudently and persistently married his sons into wealthy Florentine families rather than foreign ones (FH 7.6). Cosimo also loved classical learning to such an extent that he brought John Argyropoulos and Marsilio Ficino to Florence. Additionally, Cosimo left a strong foundation for his descendants (FH 7.6). Piero is highlighted mainly for lacking the foresight and prudence of his father; for fomenting popular resentment; and for being unable to resist the ambition of the great. Nonetheless, Machiavelli notes Piero’s “virtue and goodness” (FH 7.23). Lorenzo is noted for his youth (FH 7.23); his military prowess (FH 7.12); his desire for renown (FH 8.3); his eventual bodyguard of armed men due to the Pazzi assassination attempt (FH 8.10); and his many amorous endeavors (FH 8.36). The Histories end with the death of Lorenzo.

The Histories has received renewed attention in recent years, and scholars have increasingly seen it as not merely historical but also philosophical—in other words, as complementary to The Prince and the Discourses.

e. Other Works

Machiavelli’s other writings are briefly described here. Not every work is listed; instead, emphasis has been placed upon those that seem to have philosophical resonance.

Some of Machiavelli’s writings treat historical or political topics. In the early 1500s, he wrote several reports and speeches. They are notable for their topics and for the way in which they contain precursors to important claims in later works, such as The Prince. Among other things, Machiavelli wrote on how Duke Valentino killed Vitellozzo Vitelli (compare P 7); on how Florence tried to suppress the factions in Pistoia (compare P 17); and on how to deal with the rebels of Valdichiana.

In 1520, Machiavelli wrote a fictionalized biography, The Life of Castruccio Castracani. Many important details of Castruccio’s life are changed and stylized by Machiavelli, perhaps in the manner of Xenophon’s treatment of Cyrus. The most obvious changes are found in the final part, where Machiavelli attributes to Castruccio many sayings that are in fact almost exclusively drawn from the Lives of Diogenes Laertius. Some scholars believe that Machiavelli’s account is also beholden to the various Renaissance lives of Tamerlane—for instance, those by Poggio Bracciolini and especially Enea Silvio Piccolomini, who would become Pope Pius II and whose account became something of a genre model.

Also around 1520, Machiavelli wrote the Discourse on Florentine Affairs. Recent work has suggested the proximity in content between this work and the Florentine Histories. Also of interest is On the Natures of Florentine Men, which is an autograph manuscript which Machiavelli may have intended as a ninth book of the Florentine Histories.

Toward the end of his tenure in the Florentine government, Machiavelli wrote two poems in terza rima called I Decennali. The first seems to date from 1504-1508 and concerns the history of Italy from 1492 to 1503. It is the only work that Machiavelli published while in office. The second seems to date from around 1512 and concerns the history of Italy from 1504 to 1509. Among other things, they are precursors to concerns found in the Florentine Histories.

In general, between 1515 and 1527, Machiavelli turned more consciously toward art. He wrote a play called Le Maschere (The Masks) which was inspired by Aristophanes’ Clouds but which has not survived. Three of Machiavelli’s comedies have survived, however. L’Andria (The Girl from Andros) is a translation of Terence and was probably written between 1517 and 1520. Mandragola was probably written between 1512 and 1520; was first published in 1524; and was first performed in 1526. While original, it hearkens to the ancient world especially in how its characters are named (e.g., Lucrezia, Nicomaco). It is by far the most famous of the three and indeed is one of the most famous plays of the Renaissance. It contains many typical Machiavellian themes, the most notable of which are conspiracy and the use of religion as a mask for immoral purposes. The last of Machiavelli’s plays, Clizia, is an adaptation of Plautus. It was probably written in the early 1520s. In recent years, scholars have increasingly treated all three of these plays with seriousness and indeed as philosophical works in their own right.

In addition to I Decennali, Machiavelli wrote other poems. I Capitoli contains tercets which are dedicated to friends and which treat the topics of ingratitude, fortune, ambition, and opportunity (with virtue being notably absent). The Ideal Ruler is in the form of a pastoral. L’Asino (The Golden Ass) is unfinished and in terza rima; it has been called an “anti-comedy” and was probably penned around 1517. Between 1510 and 1515, Machiavelli wrote several sonnets and at least one serenade.

There are some other miscellaneous writings with philosophical import, most of which survive in autograph copies and which have undetermined dates of composition. Machiavelli wrote a Dialogue on Language in which he discourses with Dante on various linguistic concerns, including style and philology. Articles for a Pleasure Company is a satire on high society and especially religious confraternities. Belfagor is a short story that portrays, among other things, Satan as a wise and just prince. An Exhortation to Penitence unsurprisingly concerns the topic of penitence; the sincerity of this exhortation, however, remains a scholarly question.

Lastly, Machiavelli’s correspondence is worth noting. Some of his letters are diplomatic dispatches (the so-called “Legations”); others are personal. The Legations date from the period that Machiavelli worked for the Florentine government (1498-1512). The personal letters date from 1497 to 1527. Machiavelli’s nephew, Giuliano de’ Ricci, is responsible for assembling the copies of letters that Machiavelli had made. Particularly notable among the personal letters are the 13-21 September 1506 letter to Giovanbattista Soderini, the so-called Ghiribizzi al Soderini (Musings to Soderini); and the 10 December 1513 letter to Francesco Vettori, wherein Machiavelli first mentions The Prince.

4. Possible Philosophical Influences on Machiavelli

Machiavelli insists upon the novelty of his enterprise in several places (e.g., P 15 and D 1.pr). It is true that Machiavelli is particularly innovative and that he often appears to operate “without any respect” (sanza alcuno rispetto), as he puts it, toward his predecessors. As a result, some interpreters have gone so far as to call him the inaugurator of modern philosophy. But all philosophers are to some degree in conversation with their predecessors, even (or perhaps especially) those who seek to disagree fundamentally with what has been thought before. Thus, even with a figure as purportedly novel as Machiavelli, it is worth pondering historical and philosophical influences.

a. Renaissance Humanism

Although Machiavelli studied ancient humanists, he does not often cite them as authorities. In his own day, the most widely cited discussion of the classical virtues was Book 1 of Cicero’s De officiis. But Cicero is never named in The Prince (although Machiavelli does allude to him via the images of the fox and the lion in P 18-19) and is named only three times in the Discourses (D 1.4, 1.33, and 1.52; see also D 1.28, 1.56, and 1.59). Other classical thinkers in the humanist tradition receive similar treatment. Juvenal is quoted three times (D 2.19, 2.24, and 3.6). Virgil is quoted once in The Prince (P 17) and three times in the Discourses (D 1.23, 1.54, and 2.24). This trend tends to hold true for later thinkers, as well. Petrarch, whom Machiavelli particularly admired, is never mentioned in the Discourses, although Machiavelli does end The Prince with four lines from Petrarch’s Italia mia (93-96). One may see this relative paucity of references as suggestive that Machiavelli did not have humanist concerns. But it is possible to understand his thought as having a generally humanist tenor.

It is worth remembering that the humanists of Machiavelli’s day were almost exclusively professional rhetoricians. Though they did treat problems in philosophy, they were primarily concerned with eloquence. The revival of Greek learning in the Italian Renaissance did not change this concern and in fact even amplified it. New translations were made of ancient works, including Greek poetry and oratory, and rigorous (and in some ways newfound) philological concerns were infused with a sense of grace and nuance not always to be found in translations conducted upon the model of medieval calques. A notable example is Coluccio Salutati, who otherwise bore a resemblance to medieval rhetoricians such as Petrus de Vineis but who believed, unlike the medievals, that the best way to achieve eloquence was to imitate ancient style as concertedly as possible.

Machiavelli’s writings bear the imprint of his age in this regard. But what exactly is this imprint? What exactly is Machiavellian eloquence? Fellow philosophers have differed in their opinions. Adam Smith considered Machiavelli’s tone to be markedly cool and detached, even in discussions of the egregious exploits of Cesare Borgia. By contrast, Nietzsche understood Machiavelli’s Italian to be vibrant, almost galloping; and he thought that The Prince in particular imaginatively transported the reader to Machiavelli’s Florence and conveyed dangerous philosophical ideas in a boisterous “allegrissimo.” It is not unusual for interpreters to take one or the other of these stances today: to see Machiavelli’s works as dry and technical; or to see them as energetic and vivacious.

Recent work has examined not only Machiavelli’s eloquence but also his images, metaphors, and turns of phrase. “At a stroke” (ad un tratto) and “without any respect” (sanza alcuno rispetto) are two characteristic examples that Machiavelli frequently deploys. There has also been recent work on the many binaries to be found in Machiavelli’s works, such as virtue / fortune; ordinary / extraordinary; high / low; manly / effeminate; principality / republic; and secure / ruin. Machiavelli’s wit and his use of humor more generally have also been the subjects of recent work. Finally, increasing attention has been paid to other rhetorical devices, such as when Machiavelli speaks in his own voice; when he uses paradox, irony, and hyperbole; when he modifies historical examples for his own purposes; when he appears as a character in his narrative; and so forth. And some scholars have gone so far as to say that The Prince is not a treatise (compare D 2.1) but rather an oration, which follows the rules of classical rhetoric from beginning to end (and not just in Chapter 26). In short, it is increasingly a scholarly trend to claim that one must pay attention not only to what Machiavelli says but also to how he says it.

b. Renaissance Platonism

There is still a remarkable gap in the scholarship concerning Machiavelli’s possible indebtedness to Plato. One reason for this lacuna might be that Plato is never mentioned in The Prince and is mentioned only once in the Discourses (D 3.6). But there was certainly a widespread and effervescent revival of Platonism in Florence before and during Machiavelli’s lifetime.

What exactly is meant here, however? “Platonism” itself is a decidedly amorphous term in the history of philosophy. There are few, if any, doctrines that all Platonists have held, as Plato himself did not insist upon the dogmatic character of either his writings or his oral teaching. To which specific variety of Platonism was Machiavelli exposed? The two most instrumental figures with respect to transmitting Platonic ideas to Machiavelli’s Florence were George Gemistos Plethon and Marsilio Ficino.

Plethon visited Florence in 1438 and 1439 due to the Council of Florence, the seventeenth ecumenical council of the Catholic Church (Plethon himself opposed the unification of the Greek and Latin Churches). Cosimo de’ Medici was also enormously inspired by Plethon (as was John Argyropoulos; see FH 7.6); Ficino says in a preface to ten dialogues of Plato, written for Cosimo, that Plato’s spirit had flown from Byzantium to Florence. And he says in a preface to his version of Plotinus that Cosimo had been so deeply impressed with Plethon that the meeting between them had led directly to the foundation of Ficino’s so-called Platonic Academy.

The son of Cosimo de’ Medici’s physician, Ficino was a physician himself who also tutored Lorenzo the Magnificent. Ficino became a priest in 1473, and Lorenzo later made him canon of the Duomo so that he would be free to focus upon his true love: philosophy. Like Plethon, Ficino believed that Plato was part of an ancient tradition of wisdom and interpreted Plato through Neoplatonic successors, especially Proclus, Dionysius the Areopagite, and St. Augustine. Ficino died in 1499 after translating into Latin an enormous amount of ancient philosophy, including commentaries; and after writing his own great work, the Platonic Theology, a work of great renown that probably played no small role in the 1513 Fifth Lateran Council’s promulgation of the dogma of the immortality of the soul.

In the proem to the Platonic Theology, Ficino calls Plato “the father of philosophers” (pater philosophorum). In the Florentine Histories and in the only instance of the word “philosophy” (filosofia) in the major works, Machiavelli calls Ficino himself the “second father of Platonic philosophy” (secondo padre della platonica filosofia [FH 7.6]; compare FH 6.29, where Stefano Porcari of Rome hoped to be called its “new founder and second father” [nuovo fondatore e secondo padre]). And Machiavelli calls the syncretic Platonist Pico della Mirandola “a man almost divine [uomo quasi che divino]” (FH 8.36). Some scholars believe that Machiavelli critiques both Plato and Renaissance Platonism in such passages. Others, especially those who have problematized the sincerity of Machiavelli’s shocking moral claims, believe that these passages suggest a proximity between Machiavellian and Platonic themes.

Finally, Machiavelli’s father, Bernardo, is the principal interlocutor in Bartolomeo Scala’s Dialogue on the Laws and appears there as an ardent admirer of Plato.

c. Renaissance Aristotelianism

Aristotle is never mentioned in The Prince and is mentioned only once in the Discourses in the context of a discussion of tyranny (D 3.26). This has led some scholars to claim that Machiavelli makes a clean and deliberate break with Aristotelian philosophy. Other scholars, particularly those who see Machiavelli as a civic humanist, believe that Aristotle’s notions of republicanism and citizenship inform Machiavelli’s own republican idiom.

As with the question concerning Plato, the question of whether Aristotle influenced Machiavelli would seem to depend at least in part on the Aristotelianism to which he was exposed. Scholars once viewed the Renaissance as the rise of humanism and the rediscovery of Platonism, on the one hand; and the decline of the prevailing Aristotelianism of the medieval period, on the other. But, if anything, the reputation of Aristotle was only strengthened in Machiavelli’s time.

Italian scholastic philosophy was its own animal. Italy was exposed to more Byzantine influences than any other Western country. Furthermore, unlike a country such as France, Italy also had its own tradition of culture and inquiry that reached back to classical Rome. It is simply not the case that Italian Aristotelianism was displaced by humanism or Platonism. Indeed, perhaps from the late 13th century, and certainly by the late 14th, there was a healthy tradition of Italian Aristotelianism that stretched far into the 17th century. The main difference between the Aristotelian scholastics and their humanist rivals was one of subject matter. Whereas the humanists were rhetoricians who focused primarily on grammar, rhetoric, and poetry, the scholastics were philosophers who focused upon logic and natural philosophy. In Machiavelli’s day, university chairs in logic and natural philosophy were regularly held by Aristotelian philosophers, and lecturers in moral philosophy regularly based their material on Aristotle’s Nicomachean Ethics and Politics. And the Eudemian Ethics was translated for the first time.

Assessing to what extent Machiavelli was influenced by Aristotle, then, is not as easy as simply seeing whether he accepts or rejects Aristotelian ideas, because some ideas—or at least the interpretations of those ideas—are much more compatible with Machiavelli’s philosophy than others. It seems likely that Machiavelli did not agree fully with the Aristotelian position on political philosophy. But Alexander of Aphrodisias’ interpretation that the soul was mortal might be much more in line with Machiavelli’s position, and this view was widely known in Machiavelli’s day. Another candidate might be Pietro Pomponazzi’s prioritization of the active, temporal life over the contemplative life. A third candidate might be any of the various and so-called Averroist ideas, many of which underwent a revival in Machiavelli’s day (especially in places like Padua). Recent work has explored this final candidate in particular.

d. Xenophon

Xenophon is mentioned only once in The Prince (P 14). However, he is mentioned seven times in the Discourses (D 2.2, 2.13, 3.20, 3.22 [2x], and 3.39 [2x]), which is more than any other historian except for Livy. Machiavelli refers the reader explicitly to two works of Xenophon: the Cyropaedia, which he calls “the life of Cyrus” (la vita di Ciro; P 14; see also D 2.13); and the Hiero, which he calls by the alternate title, Of Tyranny (De tyrannide; D 2.2; see also the end of P 21).

In The Prince, Machiavelli lists Cyrus (along with Moses, Romulus, and Theseus) as one of the four “most excellent men” (P 6). He also names Cyrus, or at least Xenophon’s version of Cyrus (D 3.22), as the exemplar that Scipio Africanus imitates (P 14). Machiavelli says that whoever reads “the life of Cyrus” will see in the “life of Scipio” how much glory Scipio obtained as a result of imitating Cyrus. And he says that Scipio’s imitation consisted in the chastity, affability, humanity, and liberality outlined by Xenophon.

This kind and gentle vision of Cyrus was not shared universally by Renaissance Italians. Dante, Petrarch, and Boccaccio all characterize Cyrus as a monstrous ruler who was defeated and killed by Queen Tomyris (one of the stories of Cyrus’ demise which is related by Herodotus). Although Machiavelli at times offers information about Cyrus that is compatible with Herodotus’ account (P 6 and 26; AW 6.218), he appears to have a notable preference for Xenophon’s fictionalized version (as in P 14 above).

Machiavelli’s preference is presumably because of Xenophon’s teaching on appearances. Xenophon’s Cyrus is chaste, affable, humane, and liberal (P 14). At least two of these virtues are mentioned in later chapters of The Prince. Liberality is characterized as a virtue that consumes itself and thus cannot be maintained unless one spends what belongs to others, as did “Cyrus, Caesar, and Alexander” (P 17). Similarly, humanity (umanità) is named as a trait that one may have to disavow in times of necessity (P 18). For example, Agathocles is characterized by inhumanity (inumanità; P 8), and Hannibal was “inhumanely cruel” (inumana crudeltà; P 17; see also D 3.21-22). Nonetheless, humanity is also one of the five qualities that Machiavelli explicitly highlights as a useful thing to appear to have (P 18; see also FH 2.36). Machiavelli makes it clear that Xenophon’s Cyrus understood the need to deceive (D 2.13). Thus, Machiavelli may have learned from Xenophon that it is important for rulers (and especially founders) to appear to be something that they are not. This might hold true whether they are actual rulers (e.g., “a certain prince of present times” who says one thing and does another; P 18) or whether they are historical examples (e.g., Machiavelli’s altered story of David; P 13).

But it is worth wondering whether Machiavelli does in fact ultimately uphold Xenophon’s account. Immediately after praising Xenophon’s account of Cyrus at the end of Prince 14, Machiavelli in Prince 15 lambasts those who have presented imaginary objects of imitation. He says that he will leave out what is imagined and will instead discuss what is true. Could it be that Machiavelli puts Xenophon’s Cyrus forward as an example that is not to be followed? It is worth noting that Scipio, who imitates Cyrus, is criticized for excessive mercy (or piety; P 17). This example is especially remarkable since Machiavelli highlights Scipio as someone who was very rare (rarissimo) not only for his own times but “in the entire memory of things known” (in tutta la memoria delle cose che si sanno; P 17; compare FH 8.29). It also raises the question as to whether Machiavelli writes in a manner similar to Xenophon (D 3.22).

Lastly, it is worth noting that Xenophon was a likely influence on Machiavelli’s own fictionalized and stylized biography, The Life of Castruccio Castracani.

e. Lucretius

Ninth-century manuscripts of De rerum natura, Lucretius’ poetic account of Epicurean philosophy, are extant. However, the text was not widely read in the Middle Ages and did not obtain prominence until centuries later, when it was rediscovered in 1417 by Poggio Bracciolini. It seems to have entered broader circulation in the 1430s or 1440s, and it was first printed in 1473. De rerum natura was one of the two texts which led to a revival of Epicurean philosophy in Machiavelli’s day, the other being the life of Epicurus from Book 10 of Diogenes Laertius’ Lives (translated into Latin in 1433). These two works, along with other snippets of Epicurean philosophy already known from Seneca and Cicero, inspired many thinkers, such as Ficino and Alberti, to ponder the return of these ideas.

With respect to Machiavelli, Lucretius was an important influence on Bartolomeo Scala, a lawyer who was a friend of Machiavelli’s father. Additionally, Lucretius was an important influence on Marcello di Virgilio Adriani, who was a professor at the University of Florence; Scala’s successor in the chancery; and the man under whom Machiavelli was appointed to work in 1498. Adriani deployed Lucretius in his Florentine lectures on poetry and rhetoric between 1494 and 1515. Machiavelli may have received a substantial part of his classical education from Adriani and was likely familiar with Adriani’s lectures, at least.

Lucretius also seems to have been a direct influence on Machiavelli himself. Although Machiavelli never mentions Lucretius by name, he did hand-copy the entirety of De rerum natura (drawing largely from the 1495 print edition). Machiavelli’s transcription was likely completed around 1497 and certainly before 1512. He omits the descriptive capitula—not original to Lucretius but common in many manuscripts—that subdivide the six books of the text into smaller sections. He also adds approximately twenty marginal annotations of his own, almost all of which are concentrated in Book 2. Machiavelli’s annotations focus on the passages in De rerum natura which concern Epicurean physics—that is, the way that the cosmos would function in terms of atomic motion, atomic swerve, free will, and a lack of providential intervention. Recent work has noted that it is precisely this section of the text that received the least attention from other Renaissance annotators, many of whom focused instead upon Epicurean views on love, virtue, and vice.

Recent work has also highlighted stylistic resonances between Machiavelli’s works and De rerum natura, either directly or indirectly. To give only one example, Machiavelli says in the Discourses that he desires to “take a path as yet untrodden by anyone” (non essendo suta ancora da alcuno trita) in order to find “new modes and orders” (modi ed ordini nuovi; D 1.pr). Lucretius says that he will walk paths not yet trodden (trita) by any foot in order to gather “new flowers” (novos flores; 4.1-5). Among other possible connections are P 25 and 26; and D 1.2, 2.pr, and 3.2.

Machiavelli does not seem to have agreed with the classical Epicurean position that one should withdraw from public life (e.g., D 1.26 and 3.2). But what might Machiavelli have learned from Lucretius? One possible answer concerns the soul. Machiavelli never treats the topic of the soul substantively, and he never uses the word at all in either The Prince or the Discourses (he apparently even went so far as to delete anima from a draft of the first preface to the Discourses). For Lucretius, the soul is material, perishable, and made up of two parts: animus, which is located in the chest, and anima, which is spread throughout the body. But each part, like all things in the cosmos, is composed only of atoms, invisibly small particles of matter that are constantly in motion. From time to time, these atoms conglomerate into macroscopic masses. Human beings are such entities. But when they perish, there is no longer any power to hold the atoms of the soul together, so those atoms disperse like all others eventually do.

A second possible aspect of Lucretian influence concerns the eternity of the cosmos, on the one hand, and the constant motion of the world, on the other. Lucretius seems to have believed that the cosmos was eternal but that the world was not, whereas some thinkers in Machiavelli’s day believed that both the cosmos and the world were eternal. Machiavelli ponders the question of the eternity of the world (D 2.5). He at times claims that the world has always remained the same (D 1.pr and 2.pr; see also 1.59). He also at times claims that worldly things are in motion (P 10 and FH 5.1; compare P 25) and that human things in particular are “always in motion” (D 1.6 and 2.pr).

As recent work has shown, reading Lucretius in the Renaissance was a dangerous game. By Machiavelli’s time, Petrarch had already described Epicurus as a philosopher who was held in popular disrepute; and Dante had already suggested that those who deny the afterlife belong with “Epicurus and all his followers” (Inferno 10.13-15). In 1513, the Fifth Lateran Council condemned those who believed that the soul was mortal; those who believed in the unity of the intellect; and those who believed in the eternity of the world. It also made belief in the afterlife mandatory. Lucretius was last printed in the Italian Renaissance in 1515 and was prohibited from being read in schools by the Florentine synod in late 1516 / early 1517.

f. Savonarola

There is no comprehensive monograph on Machiavelli and Savonarola. While there has been some interesting recent work, particularly with respect to Florentine institutions, the connection between the two thinkers remains a profitable area of research.

Girolamo Savonarola was a Dominican friar who came to Florence in 1491 and who effectively ruled the city from 1494 to 1498 from the pulpits of San Marco and Santa Reparata. He was renowned for his oratorical ability, his endorsement of austerity, and his concomitant condemnation of excess and luxury. The effectiveness of his message can be seen in the stark difference between Botticelli’s Primavera and his later, post-Savonarolan Calumny of Apelles; or in the fact that Michelangelo felt compelled to toss his own easel paintings onto the so-called bonfires of the vanities. Savonarola’s influence in Florentine politics grew to immensity, and Pope Alexander VI would eventually excommunicate Savonarola after a lengthy dispute. As a result, Florence would hang and then burn Savonarola (with two others) at the stake, going so far as to toss his ashes in the Arno afterward so that no relics of him could be kept.

Machiavelli attended several of Savonarola’s sermons, which may be significant since he did not seem inclined otherwise to attend services regularly. There are interesting possible points of contact in terms of the content of these sermons, such as Savonarola’s understanding of Moses; Savonarola’s prediction of Charles VIII as a new Cyrus; and Savonarola’s use of the Biblical story of the flood.

In The Prince, Machiavelli discusses Savonarola by name only a single time, saying that he is an “unarmed prophet” who has been ruined because he does not have a way either to make believers remain firm or to make unbelievers believe (P 6). Machiavelli later acknowledges that Savonarola spoke the truth when he claimed that “our sins” were the cause of Charles VIII’s invasion of Italy, although he does not name him and in fact disagrees with Savonarola as to which sins are relevant (P 12; compare D 2.18). In the Discourses, Machiavelli is more expansive and explicit in his treatment of the friar. Savonarola convinces the Florentines, no naïve people, that he talks with God (D 1.11); helps to reorder Florence but loses reputation after he fails to uphold a law that he fiercely supported (D 1.45); foretells the coming of Charles VIII into Florence (D 1.56); and understands what Moses understands, which is that one must kill envious men who oppose one’s plans (D 3.30). Machiavelli conspicuously omits any explicit mention of Savonarola in the Florentine Histories.

It is also worth noting two other important references in Machiavelli’s corpus. The lengthiest discussion of Savonarola is Machiavelli’s 9 March 1498 letter to Ricciardo Becchi. Many commentators have read this letter as a straightforward condemnation of Savonarola’s hypocrisy, but some recent work has stressed the letter’s rhetorical nuances. To give only one example, Machiavelli discusses how Savonarola colors his “lies” (bugie). While it is true that Machiavelli does use bugie only in a negative context in the Discourses (D 1.14 and 3.6), it is difficult to maintain that Machiavelli is opposed to lying in any principled way.

Secondly, in his 17 May 1521 letter to Francesco Guicciardini, Machiavelli has been interpreted as inveighing against Savonarola’s hypocrisy. But, again, nuances and context may be important. Machiavelli does indeed implicate two other friars: Ponzo for insanity and Alberto for hypocrisy. But he simply calls Savonarola versuto, which means something like “crafty” or “versatile” and which is a quality that he never denounces elsewhere in his corpus.

g. The Bible and Its Traditions

To what extent the Bible influenced Machiavelli remains an important question. He laments that histories are no longer properly read or understood (D 1.pr); speaks of reading histories with judicious attention (sensatamente; D 1.23); and implies that the Bible is a history (D 2.5). Furthermore, he explicitly speaks of reading the Bible in this careful manner (again sensatamente; D 3.30)—the only time in The Prince or the Discourses that he mentions “the Bible” (la Bibbia). Recent work has explored what it might have meant for Machiavelli to read the Bible in this way. Additionally, recent work has explored the extent to which Machiavelli engaged with the Jewish, Christian, and Islamic traditions.

Machiavelli quotes from the Bible only once in his major works, referring to someone “. . . who filled the hungry with good things and sent the rich away empty” (D 1.26; Luke 1:53; compare I Samuel 2:5-7). The passage is from Mary’s Magnificat and refers to God. Machiavelli, however, uses the passage to refer to David.

David is one of two major Biblical figures in Machiavelli’s works. Elsewhere in the Discourses, Machiavelli attributes virtue to David and says that he was undoubtedly a man very excellent in arms, learning, and judgment (D 1.19). In a digression in The Prince, Machiavelli refers to David as “a figure of the Old Testament” (una figura del Testamento vecchio; P 13). Machiavelli offers a gloss of the story of David and Goliath which differs in numerous and substantive ways from the Biblical account (see I Samuel 17:32-40, 50-51).

Moses is the other major Biblical figure in Machiavelli’s works. He is mentioned at least five times in The Prince (P 6 [4x] and 26) and at least five times in the Discourses (D 1.1, 1.9, 2.8 [2x], and 3.30). Moses is the only one of the four most excellent men of Chapter 6 who is said to have a “teacher” (precettore; compare Achilles in P 18). In the Discourses, Moses is a lawgiver who is compelled to kill “infinite men” due to their envy and in order to push his laws and orders forward (D 3.30; see also Exodus 32:25-28).

Machiavelli sparsely treats the “ecclesiastical principality” (P 11) and the “Christian pontificate” (P 11 and 19). He calls Ferdinand of Aragon “the first king among the Christians” (P 21) and says that Cosimo Medici’s death is mourned by “all citizens and all the Christian princes” (FH 7.6).

Chapter 6 of The Prince is famous for its distinction between armed and unarmed prophets. In Chapter 26, Machiavelli refers to extraordinary occurrences “without example” (sanza essemplo): the opening of the sea, the escort by the cloud, the water from the stone, and the manna from heaven. It has long been noted that Machiavelli’s ordering of these events does not follow the order given in Exodus (14:21, 13:21, 17:6, and 16:4, respectively). However, recent work has noted that it does in fact follow exactly the order of Psalms 78:13-24.

Lastly, scholars have recently begun to examine Machiavelli’s connections to Islam. For example, some scholars believe that Machiavelli’s notion of a sect (setta) is imported from the Averroist vocabulary. Machiavelli speaks at least twice of the prophet Mohammed (FH 1.9 and 1.19), though conspicuously not when he discusses armed prophets (P 6). He discusses various Muslim princes, most importantly Saladin (FH 1.17), who is said to have virtue. Machiavelli compares the Pope with the Ottoman “Turk” and the Egyptian “Sultan” (P 19; compare P 11). He also compares “the Christian pontificate” with the Janissary and Mameluk regimes predominant under Sunni Islam (P 19; see also P 11). On occasion he refers to the Turks as “infidels” (infideli; e.g., P 13 and FH 1.17).

5. Contemporary Interpretations

The main aim of this article is to help readers find a foothold in the primary literature. A second, related aim is to help readers do so in the secondary literature.

In the spirit of bringing “common benefit to everyone” (D 1.pr), what follows is a rough outline of the scholarly landscape. It follows the practice of many recent Machiavelli scholars, for whom it is not uncommon, especially in English, to say that the views on Machiavelli can be divided into a handful of camps. Many of the differences between these camps appear to reduce to the question of how to fit The Prince and the Discourses together. Five camps are outlined below, although some scholars would of course put that number either higher or lower. Readers who are interested in understanding the warp and woof of the scholarship in greater detail are encouraged to consult the recent and more fine-grained accounts of Catherine Zuckert (2017), John T. Scott (2016), and Erica Benner (2013).

The first camp takes The Prince to be a satirical or ironic work. The 16th-century Italian jurist Alberico Gentili was one of the first interpreters to take up the position that The Prince is a satire on ruling. Rousseau and Spinoza in their own respective ways also seemed to hold this interpretation. Members of this camp typically argue that Machiavelli is a republican of one sort or another and place special emphasis upon his rhetoric. The most notable recent member of this camp is Erica Benner (2017a, 2017b, 2013, and 2009), who argues that The Prince is thoroughly ironic and that Machiavelli presents a shocking moral teaching in order to subvert it.

The second camp also places emphasis upon Machiavelli’s republicanism and thus sits in proximity to the first camp. However, members of this camp do not typically argue that The Prince is satirical or ironic. They do typically argue that The Prince presents a different teaching than does the Discourses; and that, as an earlier work, The Prince is not as comprehensive or mature a writing as the Discourses. This camp also places special emphasis upon Machiavelli’s historical context. The most notable member of this camp is Quentin Skinner (2017, 2010, and 1978). J. G. A. Pocock (2010 and 1975), Hans Baron (1988 and 1966), and David Wootton (2016) could be reasonably placed in this camp. Maurizio Viroli (2016, 2014, 2010, 2000, and 1998) could also be reasonably placed here, though he puts additional emphasis on The Prince.

The third camp argues for the unity of Machiavelli’s teaching and furthermore argues that The Prince and the Discourses approach the truth from different directions. In other words, members of this camp typically claim that Machiavelli presents the same teaching or vision in each book but from different starting points. The most notable members of this camp are Isaiah Berlin (1981 [1958]), Sheldon Wolin (1960), and Benedetto Croce (1925).

The fourth camp also argues for the unity of Machiavelli’s teaching and thus sits in proximity to the third camp. However, members of this camp do not typically argue that The Prince and Discourses begin from different starting points. And while they typically argue for the overall coherence of Machiavelli’s corpus, they do not appear to hold a consensus regarding the status of Machiavelli’s republicanism. The most notable member of this camp is Leo Strauss (1958). Harvey C. Mansfield (2017, 2016, 1998, and 1979), Catherine Zuckert (2017 and 2016), John T. Scott (2016, 2011, and 1994), Vickie Sullivan (2006, 1996, and 1994), Nathan Tarcov (2015, 2014, 2013a, 2013b, 2007, 2006, 2003, 2000, and 1982), and Clifford Orwin (2016 and 1978) could be reasonably placed here.

The fifth camp is hermeneutically beholden to Hegel, which seems at first glance to be an anachronistic approach. But Hegel’s notion of dialectic was itself substantially beholden to Proclus’ commentary on the Parmenides—a work which was readily available to Machiavelli through Ficino’s translation and which was enormously influential on Renaissance Platonism in general. The most notable member of this camp is Claude Lefort (2012 [1972]). Miguel Vatter (2017, 2013, and 2000) could be reasonably placed here and additionally deserves mention for his familiarity with the secondary literature in Spanish (an unusual achievement for Machiavelli scholars who write in English). Additionally, interpreters who are indirectly beholden to Hegel’s dialectic, via Marx, could also be reasonably placed here. Miguel Abensour (2011 [2004]), Louis Althusser (1995), and Antonio Gramsci (1949) are examples.

6. References and Further Reading

Below are listed some of the more well-known works in the scholarship, as well as some that the author has found profitable but which are perhaps not as well-known. They are arranged as much as possible in accordance with the outline of this article. Given the article’s aim, the focus is almost exclusively upon works that are available in English. It goes without saying that there are many important books that are not mentioned.

Regarding Machiavelli’s life, there are many interesting and recent biographies. Some examples include Benner (2017a), Celenza (2015), Black (2013 and 2010), Atkinson (2010), Skinner (2010), Viroli (2010, 2000, and 1998), de Grazia (1989), and Ridolfi (1964). Vivanti (2013) offers an intellectual biography. Pesman (2010) captures Machiavelli’s work for the Florentine republic. Butters (2010), Cesati (1999), and Najemy (1982) discuss Machiavelli’s relationship with the Medici. Landon (2013) examines Machiavelli’s relationship with Lorenzo di Filippo Strozzi. Masters (1999 and 1998) examines Machiavelli’s relationship with Leonardo da Vinci.

For an understanding of Machiavelli’s overall position, Zuckert (2017) is the most recent and comprehensive account of Machiavelli’s corpus, especially with respect to his politics. Other good places to begin are Nederman (2009), Viroli (1998), Mansfield (2017, 2016, and 1998), Skinner (2017 and 1978), Prezzolini (1967), Voegelin (1951), and Foster (1941). Johnston, Urbinati, and Vergara (2017) and Fuller (2016) are recent, excellent collections. Lefort (2012) and Strauss (1958) are daunting and difficult but also well worth the attempt.

Skinner (2017), Benner (2009), and Mansfield (1998) discuss virtue. Spackman (2010) and Pitkin (1984) discuss fortune, particularly with respect to the image of fortune as a woman. Saxonhouse (2016), Tolman Clarke (2005), and Falco (2004) discuss Machiavelli’s understanding of women. Benner (2017b and 2009) and Cox (2010) treat Machiavelli’s ethics.

On religion, see Parsons (2016), Tarcov (2014), Palmer (2010a and 2010b), Lynch (2010), and Lukes (1984). Biasiori and Marcocci (2018) is a recent collection concerning Machiavelli and Islam. Nederman (1999) examines free will. Blanchard (1996) discusses sight and touch.

Rahe (2017) and Parel (1992) discuss Machiavelli’s understanding of humors. Regarding various other political themes, including republicanism, see McCormick (2011), Slade (2010), Barthas (2010), Rahe (2017, 2008, and 2005), Patapan (2006), Sullivan (2006 and 1996), Forde (1995 and 1992), Bock (1990), Hulliung (1983), Skinner (1978), and Pocock (1975).

Recent works concerning The Prince include Benner (2017b and 2013), Scott (2016), Parsons (2016), Viroli (2014), Vatter (2013), Rebhorn (2010 and 1998), M. Palmer (2001), and de Alvarez (1999). Tarcov’s essays (2015, 2014, 2013a, 2013b, 2007, 2006, 2003, 2000, and 1982) are especially fine-grained analyses. Connell (2013) discusses The Prince’s composition. On deception, see Dietz (1984) and Langton and Dietz (1987). On Cesare Borgia, see Orwin (2016) and Scott and Sullivan (1994).

Recent works concerning the Discourses include Duff (2011), Najemy (2010), Pocock (2010), Hörnqvist (2004), Vatter (2000), Coby (1999), and Sullivan (1996). Mansfield (1979) and Walker (1950) are the two notable commentaries.

Regarding the Art of War, see Hörnqvist (2010), Lynch (2010 and 2003), Lukes (2004), and Colish (1998).

Regarding the Florentine Histories, see McCormick (2017), Jurdjevic (2014), Lynch (2012), Cabrini (2010), and Mansfield (1998).

Regarding Machiavelli’s poetry and plays, see Ascoli and Capodivacca (2010), Martinez (2010), Kahn (2010 and 1994), Atkinson and Sices (2007 [1985]), Patapan (2003), Sullivan (2000), and Ascoli and Kahn (1993).

Anyone who wants to learn more about the intellectual context of the Italian Renaissance should begin with the many writings of Kristeller (e.g., 1979, 1965, and 1961), whose work is a model of scholarship. See also Hankins (2000), Cassirer (2010 [1963]), and Burke (1998).

Regarding humanist educational treatises, see Kallendorf (2008). Regarding Ficino, see the I Tatti series edited by James Hankins (especially 2015, 2012, 2008, and 2001). Hankins’ examination of the “myth” of the Platonic Academy in Florence is also worth mentioning (1991). Regarding Xenophon, see Nadon (2001) and Newell (1988). Regarding Lucretius, see A. Palmer (2014), Brown (2010a and 2010b), and Rahe (2008). Norbrook, Harrison, and Hardie (2016) is a recent collection concerning Lucretius’ influence upon early modernity. The most comprehensive recent treatment of Savonarola can be found in Jurdjevic (2014).

Much of Machiavelli’s important personal correspondence has been collected in Atkinson and Sices (1996). Najemy has examined Machiavelli’s correspondence with Vettori (1993).

Those interested in the Italian scholarship should begin with the seminal work of Sasso (1993, 1987, and 1967). Careful studies of Machiavelli’s word choice can be found in Chiappelli (1974, 1969, and 1952).

Lastly, Ruffo-Fiore (1990) has compiled an annotated bibliography of Machiavelli scholarship from 1935 to 1988.

a. Primary Sources

  • Machiavelli, Niccolò. The Art of War, ed. and trans. Christopher Lynch. Chicago: University of Chicago Press, 2003.
  • Machiavelli, Niccolò. L’Arte della guerra; scritti politici minori, ed. Jean-Jacques Marchand, Denis Fachard, and Giorgio Masi. Rome: Salerno Editrice, 2001.
  • Machiavelli, Niccolò. The Chief Works and Others. Three volumes, trans. Allan Gilbert. Durham: Duke University Press, 1999 [1958].
  • Machiavelli, Niccolò. Clizia, trans. Daniel T. Gallagher. Long Grove: Waveland Press, 1996.
  • Machiavelli, Niccolò. The Comedies of Machiavelli, ed. and trans. David Sices and James B. Atkinson. Indianapolis: Hackett, 2007 [1985].
  • Machiavelli, Niccolò. Discourses on Livy, trans. Harvey C. Mansfield and Nathan Tarcov. Chicago: University of Chicago Press, 1998 [1996].
  • Machiavelli, Niccolò. Discorsi sopra la prima deca di Tito Livio, ed. Giorgio Inglese. Milano: Bur Rizzoli, 1984. Digitized 2011.
  • Machiavelli, Niccolò. Florentine Histories, trans. Laura F. Banfield and Harvey C. Mansfield. Princeton: Princeton University Press, 1988.
  • Machiavelli, Niccolò. Machiavelli and Friends: Their Personal Correspondence, ed. and trans. James B. Atkinson and David Sices. DeKalb: Northern Illinois University Press, 1996.
  • Machiavelli, Niccolò. Mandragola, trans. Mera J. Flaumenhaft. Long Grove: Waveland Press, 1981.
  • Machiavelli, Niccolò. The Prince with Related Documents, trans. and ed. William J. Connell. Boston: Bedford / St. Martin’s Press, 2005.
  • Machiavelli, Niccolò. The Prince, second edition, trans. Harvey C. Mansfield. Chicago: University of Chicago Press, 1998.
  • Machiavelli, Niccolò. Il Principe, ed. Giorgio Inglese. Torino: Giulio Einaudi, 2013.
  • Machiavelli, Niccolò. Tutte le opere. Florence: Sansoni, 1971.

b. Secondary Sources

  • Abensour, Miguel. Democracy Against the State: Marx and the Machiavellian Moment. Cambridge: Polity Press, 2011 [2004].
  • Alberti, Leon Battista. On Painting. New Haven: Yale University Press, 1966 [1956].
  • Althusser, Louis. “Machiavel et nous.” In Écrits philosophiques et politiques, 42-168. Paris: Stock / IMEC, 1995.
  • Arendt, Hannah. The Human Condition, second edition. Chicago: University of Chicago Press, 1998 [1958].
  • Ascoli, Albert Russell, and Angela Matilde Capodivacca. “Machiavelli and Poetry.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 190-205. Cambridge: Cambridge University Press, 2010.
  • Ascoli, Albert Russell, and Victoria Kahn, eds. Machiavelli and the Discourse of Literature. Ithaca: Cornell University Press, 1993.
  • Atkinson, James B. “Niccolò Machiavelli: A Portrait.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 14-30. Cambridge: Cambridge University Press, 2010.
  • Baron, Hans. In Search of Florentine Civic Humanism. Princeton: Princeton University Press, 1988.
  • Baron, Hans. The Crisis of the Early Italian Renaissance. Princeton: Princeton University Press, 1966.
  • Barthas, Jérémie. “Machiavelli in political thought from the age of revolutions to the present.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 256-273. Cambridge: Cambridge University Press, 2010.
  • Benner, Erica. Be Like the Fox: Machiavelli’s Lifelong Quest for Freedom. New York: W.W. Norton & Company, 2017a.
  • Benner, Erica. “The Necessity to Be Not-Good: Machiavelli’s Two Realisms.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 164-185. Chicago: University of Chicago Press, 2017b.
  • Benner, Erica. Machiavelli’s Prince: A New Reading. Oxford: Oxford University Press, 2013.
  • Benner, Erica. Machiavelli’s Ethics. Princeton: Princeton University Press, 2009.
  • Berlin, Isaiah. “The Originality of Machiavelli.” In Against the Current: Essays in the History of Ideas, 25-79. Oxford: Oxford University Press, 1981 [1958].
  • Biasiori, Lucio, and Giuseppe Marcocci, eds. Machiavelli, Islam and the East: Reorienting the Foundations of Modern Political Thought. London: Palgrave Macmillan, 2018.
  • Blanchard, Kenneth C. “Being, Seeing, and Touching: Machiavelli’s Modification of Platonic Epistemology.” The Review of Metaphysics 49, no. 3 (1996): 577-607.
  • Black, Robert. Machiavelli. London: Routledge, 2013.
  • Black, Robert. “Machiavelli in the Chancery.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 31-47. Cambridge: Cambridge University Press, 2010.
  • Bock, Gisela, Quentin Skinner, and Maurizio Viroli, eds. Machiavelli and Republicanism. Cambridge: Cambridge University Press, 1990.
  • Brown, Alison. “Philosophy and Religion in Machiavelli.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 157-172. Cambridge: Cambridge University Press, 2010a.
  • Brown, Alison. The Return of Lucretius to Renaissance Florence. Cambridge: Harvard University Press, 2010b.
  • Burke, Peter. The European Renaissance: Centres and Peripheries. Oxford: Blackwell, 1998.
  • Butters, Humfrey. “Machiavelli and the Medici.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 64-79. Cambridge: Cambridge University Press, 2010.
  • Cabrini, Anna Maria. “Machiavelli’s Florentine Histories.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 128-143. Cambridge: Cambridge University Press, 2010.
  • Cassirer, Ernst. The Individual and the Cosmos in Renaissance Philosophy. Chicago: University of Chicago Press, 2010 [1963].
  • Celenza, Christopher S. Machiavelli: A Portrait. Cambridge: Harvard University Press, 2015.
  • Cesati, Franco. The Medici. Florence: Mandragora, 1999.
  • Chabod, Federico. Machiavelli and the Renaissance, trans. David Moore. London: Bowes and Bowes, 1960.
  • Chiappelli, Fredi. Machiavelli e La ‘Lingua Fiorentina.’ Bologna: Massimiliano Boni, 1974.
  • Chiappelli, Fredi. Nuovi Studi sul Linguaggio del Machiavelli. Florence: Le Monnier, 1969.
  • Chiappelli, Fredi. Studi sul Linguaggio del Machiavelli. Florence: Le Monnier, 1952.
  • Clarke, Michelle Tolman. “On the Woman Question in Machiavelli.” The Review of Politics 67, no. 2 (2005): 229-256.
  • Coby, Patrick. Machiavelli’s Romans. Lanham: Lexington Books, 1999.
  • Colish, Marcia L. “Machiavelli’s Art of War: A Reconsideration.” Renaissance Quarterly 51, no. 4 (1998): 1151-1168.
  • Connell, William J. “Dating The Prince: Beginnings and Endings.” The Review of Politics 75, no. 4 (2013): 497-514.
  • Cox, Virginia. “Rhetoric and Ethics in Machiavelli.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 173-189. Cambridge: Cambridge University Press, 2010.
  • Croce, Benedetto. Elementi di Politica. Bari: Laterza, 1925.
  • De Alvarez, Leo Paul. The Machiavellian Enterprise: A Commentary on The Prince. DeKalb: Northern Illinois University Press, 2008 [1999].
  • De Grazia, Sebastian. Machiavelli in Hell. Princeton: Princeton University Press, 1989.
  • Dietz, Mary. “Trapping the Prince: Machiavelli and the Politics of Deception.” The American Political Science Review 80, no. 3 (1986): 777-799.
  • Duff, Alexander S. “Republicanism and the Problem of Ambition: The Critique of Cicero in Machiavelli’s Discourses.” The Journal of Politics 73, no. 4 (2011): 980-992.
  • Falco, Maria J., ed. Feminist Interpretations of Machiavelli. University Park: Penn State University Press, 2004.
  • Ficino, Marsilio. On Dionysius the Areopagite Volume 1, ed. and trans. Michael J.B. Allen. Cambridge: Harvard University Press, 2015.
  • Ficino, Marsilio. Commentaries on Plato, Volume 2, Part 1, ed. and trans. Maude Vanhaelen. Cambridge: Harvard University Press, 2012.
  • Ficino, Marsilio. Commentaries on Plato, Volume 1, ed. and trans. Michael J. B. Allen. Cambridge: Harvard University Press, 2008.
  • Ficino, Marsilio. Platonic Theology, Volume 1, ed. James Hankins and William Bowen and trans. Michael J. B. Allen. Cambridge: Harvard University Press, 2001.
  • Forde, Steven. “International Realism and the Science of Politics: Thucydides, Machiavelli, and Neorealism.” International Studies Quarterly 39, no. 2 (1995): 141-160.
  • Forde, Steven. “Varieties of Realism: Thucydides and Machiavelli.” The Journal of Politics 54, no. 2 (1992): 372-393.
  • Foster, Michael. Masters of Political Thought, Volume 1: Plato to Machiavelli. Boston: Houghton Mifflin Company, 1941.
  • Fuller, Timothy, ed. Machiavelli’s Legacy: The Prince After Five Hundred Years. Philadelphia: University of Pennsylvania Press, 2015.
  • Gilbert, Allan H. Machiavelli’s Prince and Its Forerunners. Durham: Duke University Press, 1938.
  • Gilbert, Felix. Machiavelli and Guicciardini: Politics and History in Sixteenth-Century Florence. New York: W.W. Norton & Company, 1984.
  • Gilbert, Felix. History, Choice, and Commitment. Cambridge: The Belknap Press, 1977.
  • Gramsci, Antonio. Note sul Machiavelli, sulla politica e sullo stato moderno. Torino: Einaudi, 1949.
  • Hankins, James, ed. Renaissance Civic Humanism: Reappraisals and Reflections. Cambridge: Cambridge University Press, 2000.
  • Hankins, James. “The Myth of the Platonic Academy of Florence.” Renaissance Quarterly 44, no. 3 (1991): 429-475.
  • Hörnqvist, Mikael. “Machiavelli’s Military Project and the Art of War.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 112-127. Cambridge: Cambridge University Press, 2010.
  • Hörnqvist, Mikael. Machiavelli and Empire. Cambridge: Cambridge University Press, 2004.
  • Hulliung, Mark. Citizen Machiavelli. Princeton: Princeton University Press, 1983.
  • Jurdjevic, Mark. A Great and Wretched City: Promise and Failure in Machiavelli’s Florentine Political Thought. Cambridge: Harvard University Press, 2014.
  • Kahn, Victoria. “Machiavelli’s Afterlife and Reputation to the Eighteenth Century.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 239-255. Cambridge: Cambridge University Press, 2010.
  • Kahn, Victoria. Machiavellian Rhetoric: From the Counter-Reformation to Milton. Princeton: Princeton University Press, 1994.
  • Kallendorf, Craig W., ed. and trans. Humanist Educational Treatises. Cambridge: Harvard University Press, 2008 [2002].
  • Kristeller, Paul Oskar. Renaissance Thought and Its Sources, ed. Michael Mooney. New York: Columbia University Press, 1979.
  • Kristeller, Paul Oskar. Renaissance Thought II: Papers on Humanism and the Arts. New York: Harper and Row, 1965.
  • Kristeller, Paul Oskar. Renaissance Thought: The Classic, Scholastic, and Humanist Strains. New York: Harper and Row, 1961.
  • Landon, William J. Lorenzo de Filippo Strozzi and Niccolò Machiavelli. Toronto: University of Toronto Press, 2013.
  • Langton, John, and Mary Dietz. “Machiavelli’s Paradox: Trapping or Teaching the Prince.” The American Political Science Review 81, no. 4 (1987): 1277-1288.
  • Lukes, Timothy J. “Martialing Machiavelli: Reassessing the Military Reflections.” Journal of Politics 66, no. 4 (2004): 1089-1108.
  • Lukes, Timothy J. “Lionizing Machiavelli.” The American Political Science Review 95, no. 3 (2001): 561-75.
  • Lukes, Timothy J. “To Bamboozle With Goodness: The Political Advantages of Christianity in the Thought of Machiavelli.” Renaissance and Reformation 8, no. 4 (1984): 266-77.
  • Lynch, Christopher. “War and Foreign Affairs in Machiavelli’s Florentine Histories.” The Review of Politics 74, no. 1 (2012): 1-26.
  • Lynch, Christopher. “The Ordine Nuovo of Machiavelli’s Arte della Guerra: Reforming Ancient Matter.” History of Political Thought 31, no. 3 (2010): 407-425.
  • Lynch, Christopher. “Machiavelli on Reading the Bible Judiciously.” Hebraic Political Studies 1, no. 2 (2006): 162-185.
  • Lefort, Claude. Machiavelli in the Making, trans. Michael B. Smith. Evanston: Northwestern University Press, 2012.
  • Major, Rafael. “A New Argument for Morality: Machiavelli and the Ancients.” Political Research Quarterly 60, no. 2 (2007): 171-179.
  • Mansfield, Harvey C. “Machiavelli on Necessity.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 39-57. Chicago: University of Chicago Press, 2017.
  • Mansfield, Harvey C. “Machiavelli’s Enterprise.” In Machiavelli’s Legacy, ed. Timothy Fuller, 11-33. Philadelphia: University of Pennsylvania Press, 2016.
  • Mansfield, Harvey C. Machiavelli’s Virtue. Chicago: University of Chicago Press, 1998 [1996].
  • Mansfield, Harvey C. Machiavelli’s New Modes and Orders: A Study of the Discourses on Livy. Chicago: University of Chicago Press, 1979.
  • Martinez, Ronald L. “Comedian, Tragedian: Machiavelli and Traditions of Renaissance Theater.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 206-222. Cambridge: Cambridge University Press, 2010.
  • Masters, Roger D. Fortune is a River: Leonardo da Vinci and Niccolò Machiavelli’s Magnificent Dream to Change the Course of Florentine History. New York: Free Press, 1999.
  • Masters, Roger D. Machiavelli, Leonardo, and the Science of Power. Notre Dame: University of Notre Dame Press, 1998.
  • McCormick, John P. “On the Myth of a Conservative Turn in Machiavelli’s Florentine Histories.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 330-351. Chicago: University of Chicago Press, 2017.
  • McCormick, John P. Machiavellian Democracy. Cambridge: Cambridge University Press, 2011.
  • Nadon, Christopher. Xenophon’s Prince: Republic and Empire in the Cyropaedia. Berkeley: University of California Press, 2001.
  • Najemy, John M. “Society, Class, and State in Machiavelli’s Discourses on Livy.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 96-111. Cambridge: Cambridge University Press, 2010.
  • Najemy, John M. Between Friends: Discourses of Power and Desire in the Machiavelli-Vettori Letters of 1513-1515. Princeton: Princeton University Press, 1993.
  • Nederman, Cary J. Machiavelli: A Beginner’s Guide. London: Oneworld, 2009.
  • Nederman, Cary J. “Amazing Grace: Fortune, God, and Free Will in Machiavelli’s Thought.” Journal of the History of Ideas 60, no. 4 (1999): 617-638.
  • Newell, Waller R. Tyranny: A New Interpretation. Cambridge: Cambridge University Press, 2013.
  • Newell, Waller R. “Machiavelli and Xenophon on Princely Rule: A Double-Edged Encounter.” The Journal of Politics 50, no. 1 (1988): 108-130.
  • Norbrook, David, Stephen Harrison, and Philip Hardie, eds. Lucretius and the Early Modern. Oxford: Oxford University Press, 2016.
  • Orwin, Clifford. “The Riddle of Cesare Borgia and the Legacy of Machiavelli’s Prince.” In Machiavelli’s Legacy, ed. Timothy Fuller, 156-170. Philadelphia: University of Pennsylvania Press, 2016.
  • Orwin, Clifford. “Machiavelli’s Unchristian Charity.” The American Political Science Review 72, no. 4 (1978): 1217-1228.
  • Palmer, Ada. Reading Lucretius in the Renaissance. Cambridge: Harvard University Press, 2014.
  • Palmer, Michael. Masters and Slaves: Revisioned Essays in Political Philosophy. Lanham: Lexington Books, 2001.
  • Parel, Anthony J. The Machiavellian Cosmos. New Haven: Yale University Press, 1992.
  • Parsons, William B. Machiavelli’s Gospel: The Critique of Christianity in The Prince. Rochester: University of Rochester Press, 2016.
  • Patapan, Haig. Machiavelli in Love: The Modern Politics of Love and Fear. Lanham: Lexington Books, 2007.
  • Patapan, Haig. “I Capitoli: Machiavelli’s New Theogony.” The Review of Politics 65, no. 2 (2003): 185-207.
  • Pesman, Roslyn. “Machiavelli, Piero Soderini, and the Republic of 1494-1512.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 48-63. Cambridge: Cambridge University Press, 2010.
  • Pitkin, Hanna Fenichel. Fortune is a Woman: Gender and Politics in the Thought of Niccolò Machiavelli. Berkeley: University of California Press, 1984.
  • Pocock, J. G. A. “Machiavelli and Rome: The Republic as Ideal and as History.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 144-156. Cambridge: Cambridge University Press, 2010.
  • Pocock, J. G. A. The Machiavellian Moment: Florentine Political Thought and the Atlantic Republican Tradition. Princeton: Princeton University Press, 1975.
  • Prezzolini, Giuseppe. Machiavelli. New York: Farrar, Straus and Giroux, 1967.
  • Ruffo-Fiore, Silvia. Niccolò Machiavelli: An Annotated Bibliography of Modern Criticism and Scholarship. New York: Greenwood Press, 1990.
  • Rahe, Paul A. “Machiavelli and the Modern Tyrant.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 207-233. Chicago: University of Chicago Press, 2017.
  • Rahe, Paul A. Against Throne and Altar: Machiavelli and Political Theory under the English Republic. Cambridge: Cambridge University Press, 2008.
  • Rahe, Paul A., ed. Machiavelli’s Liberal Republican Legacy. Cambridge: Cambridge University Press, 2005.
  • Rebhorn, Wayne A. “Machiavelli’s Prince in the Epic Tradition.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 80-95. Cambridge: Cambridge University Press, 2010.
  • Rebhorn, Wayne A. Foxes and Lions: Machiavelli’s Confidence Men. Ithaca: Cornell University Press, 1988.
  • Ridolfi, Roberto. The Life of Niccolò Machiavelli, trans. Cecil Grayson. Chicago: University of Chicago Press, 1964.
  • Sasso, Gennaro. Niccolò Machiavelli. Bologna: Il Mulino, 1993.
  • Sasso, Gennaro. Machiavelli e gli antichi e altri saggi. Milan: Ricciardi, 1987.
  • Sasso, Gennaro. Studi su Machiavelli. Naples: Morano, 1967.
  • Savonarola, Girolamo. Apologetic Writings, ed. and trans. M. Michèle Mulchahey. Cambridge: Harvard University Press, 2015.
  • Savonarola, Girolamo. Trattato sul Governo di Firenze. Florence: Franco Cesati Editore, 2006.
  • Savonarola, Girolamo. Selected Writings of Girolamo Savonarola: Religion and Politics, 1490-1498, ed. and trans. Anne Borelli and Maria Pastore Passaro. New Haven: Yale University Press, 2006.
  • Savonarola, Girolamo. Prison Meditations on Psalms 51 and 31, ed. and trans. John Patrick Donnelly. Milwaukee: Marquette Press, 2011 [1994].
  • Savonarola, Girolamo. The Triumph of the Cross. London: Sands and Co., 1901.
  • Saxonhouse, Arlene W. “Machiavelli’s Women.” In Machiavelli’s Legacy, ed. Timothy Fuller, 70-86. Philadelphia: University of Pennsylvania Press, 2016.
  • Scott, John T. The Routledge Guidebook to Machiavelli’s The Prince. London: Routledge, 2016.
  • Scott, John T., and Vickie B. Sullivan. “Patricide and the Plot of The Prince: Cesare Borgia and Machiavelli’s Italy.” The American Political Science Review 88, no. 4 (1994): 887-900.
  • Skinner, Quentin. “Machiavelli and the Misunderstanding of Princely Virtù.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 139-163. Chicago: University of Chicago Press, 2017.
  • Skinner, Quentin. Machiavelli. New York: Sterling Publishing, 2010 [1981].
  • Skinner, Quentin. The Renaissance, vol. 1 of The Foundations of Modern Political Thought. Cambridge: Cambridge University Press, 1978.
  • Slade, Francis. “Two Versions of Political Philosophy: Teleology and the Conceptual Genesis of the Modern State.” In Natural Moral Law in Contemporary Society, ed. Holger Zaborowski, 235-263. Washington, D.C.: The Catholic University of America Press, 2010.
  • Spackman, Barbara. “Machiavelli and Gender.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 223-238. Cambridge: Cambridge University Press, 2010.
  • Strauss, Leo. Thoughts on Machiavelli. Chicago: University of Chicago Press, 1978 [1958].
  • Sullivan, Vickie B. Machiavelli, Hobbes, and the Formation of a Liberal Republicanism in England. Cambridge: Cambridge University Press, 2006.
  • Sullivan, Vickie B., ed. The Comedy and Tragedy of Machiavelli. New Haven: Yale University Press, 2000.
  • Sullivan, Vickie B. Machiavelli’s Three Romes. DeKalb: Northern Illinois University Press, 1996.
  • Tarcov, Nathan. “Machiavelli’s Humanity.” In In Search of Humanity: Essays in Honor of Clifford Orwin, ed. Andrea Radasanu, 177-186. Lanham: Lexington Books, 2015.
  • Tarcov, Nathan. “Machiavelli’s Critique of Religion.” Social Research 81, no. 1 (2014): 193-216.
  • Tarcov, Nathan. “Machiavelli in The Prince: His Way of Life in Question.” In Political Philosophy Cross-Examined: Perennial Challenges to the Philosophic Life. Essays in Honor of Heinrich Meier, ed. Thomas L. Pangle and J. Harvey Lomax, 101-118. New York: Palgrave Macmillan, 2013a.
  • Tarcov, Nathan. “Belief and Opinion in Machiavelli’s Prince.” The Review of Politics 75, no. 4 (2013b): 573-586.
  • Tarcov, Nathan. “Freedom, Republics, and Peoples in Machiavelli’s Prince.” In Freedom and the Human Person, ed. Richard Velkley, 122-142. Washington, D.C.: Catholic University of America Press, 2007.
  • Tarcov, Nathan. “Law and Innovation in Machiavelli’s Prince.” In Enlightening Revolutions: Essays in Honor of Ralph Lerner, ed. Svetozar Minkov, 77-90. Lanham: Lexington Books, 2006.
  • Tarcov, Nathan. “Arms and Politics in Machiavelli’s Prince.” In Entre Kant et Kosovo: Études offertes … Pierre Hassner, ed. Anne-Marie Le Gloannec and Aleksander Smolar, 109-121. Paris: Presses de la Fondation Nationale des Sciences Politiques, 2003.
  • Tarcov, Nathan. “Machiavelli and the Foundations of Modernity: A Reading of Chapter 3 of The Prince.” In Educating the Prince: Essays in Honor of Harvey Mansfield, ed. Mark Blitz and William Kristol, 30-44. Lanham: Rowman and Littlefield, 2000.
  • Tarcov, Nathan. “Quentin Skinner’s Method and Machiavelli’s Prince.” Ethics 92, no. 4 (1982): 692-709.
  • Vatter, Miguel. “Machiavelli, Ancient Theology, and the Problem of Civil Religion.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 113-137. Chicago: University of Chicago Press, 2017.
  • Vatter, Miguel. Machiavelli’s The Prince. London: Bloomsbury, 2013.
  • Vatter, Miguel. Between Form and Event: Machiavelli’s Theory of Political Freedom. New York: Fordham University Press, 2014 [2000].
  • Viroli, Maurizio. “The Redeeming Prince.” In Machiavelli’s Legacy, ed. Timothy Fuller, 34-53. Philadelphia: University of Pennsylvania Press, 2016.
  • Viroli, Maurizio. Redeeming The Prince: The Meaning of Machiavelli’s Masterpiece. Princeton: Princeton University Press, 2014.
  • Viroli, Maurizio. Machiavelli’s God. Princeton: Princeton University Press, 2010.
  • Viroli, Maurizio. Niccolò’s Smile: A Biography of Machiavelli. New York: Farrar, Straus and Giroux, 2000.
  • Viroli, Maurizio. Machiavelli. New York: Oxford University Press, 1998.
  • Vivanti, Corrado. Machiavelli: An Intellectual Biography, trans. Simon MacMichael. Princeton: Princeton University Press, 2013.
  • Voegelin, Eric. “Machiavelli’s Prince: Background and Formation.” The Review of Politics 13, no. 2 (1951): 142-168.
  • Walker, Leslie J. The Discourses of Niccolò Machiavelli, two volumes. London, 1975 [1950].
  • Warner, John M., and John T. Scott. “Sin City: Augustine and Machiavelli’s Reordering of Rome.” The Journal of Politics 73, no. 3 (August 2011): 857-871.
  • Wolin, Sheldon. Politics and Vision. Princeton: Princeton University Press, 2004 [1960].
  • Wootton, David. “Machiavelli and the Business of Politics.” In Machiavelli’s Legacy, ed. Timothy Fuller, 87-104. Philadelphia: University of Pennsylvania Press, 2016.
  • Zuckert, Catherine. Machiavelli’s Politics. Chicago: University of Chicago Press, 2017.
  • Zuckert, Catherine. “Machiavelli’s Revolution in Thought.” In Machiavelli’s Legacy, ed. Timothy Fuller, 54-69. Philadelphia: University of Pennsylvania Press, 2016.

 

Author Information

Kevin Honeycutt
Email: honeycutt_ks@mercer.edu
Mercer University
U. S. A.

Catharine Trotter Cockburn (1679?—1749)

Catharine Trotter Cockburn was an active contributor to early modern philosophical discourse in England, especially regarding morality. Her philosophical production was primarily in defense of John Locke and Samuel Clarke. Nevertheless, her thinking was original and independent in many respects.

Cockburn’s moral philosophy combines elements of Locke’s epistemology with Clarke’s fitness theory, and its central axiom is that the true ground of morality consists in human nature. She argued that since all human beings are naturally provided with reason, moral obligation rests on the conformity of God’s command to our own reason. According to her anti-voluntarist moral view, the will of God does not lay the foundations of morality, but it gives morality the force of a law. Furthermore, Cockburn maintained that Man is naturally inclined towards sociability and is consequently morally obliged to contribute to the good and preservation of society. This is one of the most distinctive of Cockburn’s ideas, which departs from a strictly Lockean moral view.

Cockburn entertained a universal and anti-dogmatic idea of the Christian religion, founded on the essentials of human nature, namely reason and sociability. In her view, since there is no absolutely perfect communion, everyone can choose the one she or he judges to be the best. Churches should not waste time presuming to be infallible; rather, they should aim at satisfying their adherents by teaching those truths necessary for salvation. Thus, she converted to the Church of England from Catholicism.

Although mainly focused on morality, Cockburn also dealt with some metaphysical issues that often connect to it, particularly the nature of the soul and the reality of space. Regarding the former, she inquired whether the soul is material or spiritual, concluding that although it is probably immaterial, there is no evidence against either its immateriality or the possibility of its being thinking matter. Moreover, while she defended Locke’s position that only consciousness makes personal identity, Cockburn also gave an original mode-based interpretation of Locke’s view on personhood. As regards the reality of space, she rejected Edmund Law’s position against Clarke that space is only an abstract idea. On the contrary, she argued that space is a real being that can fill up the abyss between body and spirit since it partakes of the nature of both.

Table of Contents

  1. Life
  2. Moral Philosophy
    1. The True Grounds of Morality
    2. Moral Obligation
  3. Religion
  4. Metaphysical Issues
    1. The Nature of the Soul
    2. The Reality of Space
  5. Originality
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Catharine Trotter was born in London probably on August 16, 1679. This is the date provided by Thomas Birch (1705-1766), her official biographer and the editor of the collection of her posthumous Works (1751). However, her birthdate has been recently questioned by Anne Kelley, who found an entry of baptism for “Katherine Trotters, daughter of David Trotters, gentleman, and his wife Sarah” for August 29, 1674, in the Register of St Andrew, Holborn (Kelley 2002, 1). Catharine was the younger daughter of Captain David Trotter, commodore in the Royal Navy, and Mrs. Sarah Ballenden. According to the inscription on Catharine’s gravestone in the cemetery of Longhorsley, she died on May 11, 1749, “in the 70th year of her age.” This seems to confirm the date proposed by Birch as her most probable birthdate.

After her father’s death in 1683, King Charles II granted a pension to her family that was barely sufficient for survival. Little is known of Catharine Trotter’s life until 1701. Birch reports that as a child, she taught herself French and received help in learning Latin, grammar, and logic. At the age of sixteen, she started her career as a playwright. From 1695 to 1706, she wrote and published five plays: Agnes de Castro (1695), The Fatal Friendship (1698), Love at a Loss (1700), The Unhappy Penitent (1701), and The Revolution of Sweden (1706), which were well received and repeatedly performed.

Between 1701 and 1703, her family moved to Salisbury, where Catharine found the favor of Elizabeth Burnet, the wife of the bishop Gilbert Burnet. There, she devoted herself to studying John Locke’s philosophy. In 1702 she anonymously published her first philosophical work, the Defence of Mr. Locke’s Essay of Human Understanding, written in response to the three anonymous Remarks upon an Essay Concerning Humane Understanding, published in London between 1697 and 1699, which had challenged John Locke’s epistemology and moral philosophy. The worth of her Defence was recognized by prominent philosophers of the time, including John Toland, Gottfried W. Leibniz, and John Locke himself. Locke was so impressed by “the strength and clearness” of Cockburn’s reasoning that once he came to know the authorship of the Defence, he sent her a sincere letter of appreciation and a number of books as gifts.

In 1707 she converted to Anglicanism from the Church of Rome, and on that occasion she published A Discourse Concerning a Guide in Controversies, in Two Letters: Written to One of the Church of Rome, by a Person Lately Converted from that Communion, explaining the main reasons for her choice. One year later Catharine married Patrick Cockburn, a clergyman, and they had three daughters, Mary, Catherine, and Grissel, and a son, John. From 1714 to 1726, their family experienced serious financial difficulties because of Patrick’s refusal to take the oath of abjuration against the pretender James Stuart. In this period Catharine devoted herself to her family and was totally diverted from her studies in philosophy. After Patrick eventually took the oath in 1726, he was appointed to the episcopal congregation of Aberdeen and their family’s condition rapidly improved. She then had the opportunity to pursue her intellectual interests, and later that same year, she wrote A Vindication of Mr. Locke’s Christian Principles, from the Injurious Imputations of Dr. Holdsworth. However, this essay remained unpublished until its inclusion in the 1751 edition of her Works.

In 1737, the Cockburns moved to Longhorsley, the last destination of Patrick’s career, and here they spent the final part of their lives. This was the most intense and prolific period of Catharine Cockburn’s life as a philosopher.

In 1739 she wrote her Remarks upon Some Writers in the Controversy Concerning the Foundation of Moral Duty and Moral Obligation, which was published in The History of the Works of the Learned in 1743. In 1747 she also published The Principles and Reasonings of Dr. Rutherforth’s Essay on the Nature and Obligation of Virtue, criticizing Thomas Rutherforth’s moral philosophy. Both works were written in defence of Samuel Clarke.

Catharine Cockburn also discussed her philosophical and religious positions in her correspondence with several people, especially Thomas Burnet of Kemnay; her son, John; her niece, Anne Arbuthnot; Thomas Sharp; and Edmund Law.

She was aware of the bias against women’s intellectual skills, and she lucidly resolved to publish all her philosophical writings anonymously for the sake of truth only. As she explained to Thomas Burnet of Kemnay, “a woman’s name would give a prejudice against a work of this nature; and truth and reason have less force, when the person, who defends them, is prejudged against” (Cockburn 1751, II: 155). Interestingly, towards the end of her life, Cockburn entrusted Thomas Birch with publishing a two-volume collection of her works. By that time she probably felt ready to stop hiding. She died in May 1749, only a few weeks after her husband’s death, and her Works was published posthumously in 1751.

2. Moral Philosophy

a. The True Grounds of Morality

Due to the style and structure of her philosophical writings, Catharine Cockburn’s thought is not presented systematically. In fact, since all her works were written in defence of someone else (either John Locke or Samuel Clarke), she was compelled to follow the reasoning of her adversaries. While she addressed a number of philosophical issues, such as the nature and the immortality of the soul, thinking matter, the nature of substances, and the origin of evil, giving original and sophisticated contributions on such subjects, moral philosophy was still her primary concern.

Cockburn’s view of morality, which takes form throughout her works, is a combination of Locke’s principles of knowledge and Clarke’s moral fitness theory, and also includes elements from Cambridge Platonism and moral sense theory. She entertains an anthropocentric view of morality, defending the idea that human beings are naturally rational and social creatures. Accordingly, she argues that the true ground of morality is to be found neither in eternal moral truths nor in God’s command but in human nature itself.

In the first of her philosophical works, the Defence of Mr. Locke’s Essay (1702), Cockburn replies to support John Locke against Remarks upon an Essay Concerning Humane Understanding, probably written by Thomas Burnet (1635-1715) between 1697 and 1699—although Burnet’s authorship has been recently questioned (Walmsley, Craig, and Burrows 2016). She summarizes the Remarker’s objections in three main points: the doctrine of natural conscience, which he opposes to Locke’s anti-innatism; his accusation of voluntarism against Locke; and his worries about the possibility of a material soul and thinking matter.

Adhering to Locke’s epistemology, Cockburn argues that we cannot have any idea not derived from sensation and reflection, and as a consequence we can find the true grounds of morality by following Locke’s principles of knowledge. As such, she believes that good and evil are not absolute principles imprinted in our minds by God from the beginning; instead, they are ideas formed in us by pleasure and pain. Contrary to this idea, the Remarker denies that Locke’s epistemology could provide “a sure foundation for morality” (Burnet (?) 1697a, 4) and instead holds that human beings are endowed with a “natural conscience.” This is a “natural sagacity” or an “instinct” which operates within us as “a principle of action” and directs our behaviour prior to reason (Burnet (?) 1699, 7-8). Rebuffing the Remarker’s objections, Catharine Cockburn maintains that no morality is possible independently of ratiocination since moral virtues such as justice, fidelity, and gratitude would be empty notions if taken with no relation to human beings. She points out that although Locke rejected metaphysical or moral truths originally imprinted in the mind, he never denied the existence of a power in the soul of perceiving and distinguishing between good and evil. She argues simply that even if this power is so immediate that it seems to prevent any ratiocination, it is actually an effect of ratiocination itself. This power in the soul is what Cockburn calls “conscience,” which does not consist in an inward moral sense as argued by the author of the Remarks, but instead comes from sensation and reflection and is set to work through man’s first persuasions and confirmed by his habits. Conscience can be very useful in morality when one is rightly educated, but it can also be misleading when corrupted by vicious customs. Thus, Cockburn concludes that conscience, far from proving innate moral principles, must be taken neither for a moral law nor for the true foundation of morality.

Furthermore, in Cockburn’s moral philosophy, the grounds of morality do not rest in the original and absolute moral principles in God’s mind. More precisely, she does not deny the reality of such principles but instead argues that the perfect being and its moral attributes of goodness and justice are infinitely beyond our narrow capacities. Human beings can have an idea of God and his attributes only by reflecting upon themselves: “for whatever is the original standard of good and evil, it is plain, we have no notion of them but by their conformity, or repugnancy to our reason, and with relation to our nature” (I: 57-58). Interestingly, according to Cockburn we first have a notion of good, and then we know that God himself is good. Therefore, the nature of God neither provides a sure foundation for morality nor can be the rule of good and evil.

Instead, Cockburn adopts an anthropocentric view according to which the nature of man and the good of society are to us the reason and rule of moral good and evil. Rejecting the Remarker’s accusation of moral relativism against Locke’s anti-innatism, she particularly emphasizes that since reason and sociability are essential to human nature, they are the true and immutable grounds of morality. In fact, God has fitted everything to its proper end (which for mankind is happiness), and accordingly he requires those things of us to which he has suited our nature. On this point, Cockburn explicitly refers to Grotius’ view that the law of nature is the product of human nature itself, and hence she draws the conclusion that “it must subsist as long as human nature” (I: 58). In other words, as long as human beings are human beings, they can infallibly know the difference between good and evil by the light of reason, and accordingly they can act suitably to their sociable nature. It is worth noting that Cockburn deliberately refrains from engaging in a metaphysical controversy with the Remarker on morality; indeed, this is not the main concern of her Defence. She aims rather at finding the epistemological and ontological foundation of morality from a human perspective. As we will see below, her later works show a stronger commitment to the metaphysical and theological aspects of morality.

The anonymous Remarker also charges Locke with voluntarism. Obviously he does not use this term, which was coined only in the nineteenth century to define a moral theory according to which will takes priority over intellect, and as applied to divine action, holds that morality originates from the will of God. Historically, this view was attributed to Augustine of Hippo, Duns Scotus, William of Ockham, and in the early modern age, to Thomas Hobbes and Robert Boyle. The opposite approach is usually called “intellectualism,” which states that intellect takes precedence over will, and moral standards eternally exist in God’s intellect, determining his will and command. This view is usually ascribed to Thomas Aquinas in the Middle Ages and Cambridge Platonists in the seventeenth century.

The Remarker accuses Locke of grounding morality in the arbitrary will of God, enforcing it by a system of rewards and punishments. From the Remarker’s point of view, this supposition has dangerous consequences for morality: he points out that if the will of God were the original rule of good and evil, without any rule determining his will, there would not be any rule of sin to God either, and God himself would be “the author of sin” (Burnet (?) 1697b, 22).

Rejecting this accusation, Cockburn explains that Locke’s notions of “will of God” and “punishments and rewards” could give morality the force of law but were not meant to be its true foundation. This is a central point in her moral philosophy, which clarifies the role of God’s will and command and at the same time reaffirms the importance of human reason in morality. She maintains that, as with the case of good and evil, “we can only know the will of God by its conformity to our nature” (I: 62), and therefore his command would not have any effectiveness if it were not “knowable to us by the light of nature” (I: 61). However, in her first philosophical work, Cockburn does not provide further details concerning moral obligation, limiting herself to the claim that God’s command is not the source of obligation and that human beings are obliged to do what he commands by their own reason. She further develops her notion of moral obligation in her mature works, especially in the Remarks upon Some Writers (1743) and in the Remarks upon the Principles and Reasonings of Dr. Rutherforth’s Essay (1747).

b. Moral Obligation

Like her Defence of Locke, Cockburn’s later philosophical works pursue an apologetic purpose: to defend Samuel Clarke from the attacks of a number of critics. Nevertheless, these writings show her evident intellectual autonomy. She was particularly inspired by Clarke’s doctrine of moral fitness, according to which an agreement or disagreement of some things with others necessarily arises from different relations among different things. Clarke argued that there is an eternal universal fitness of things that precedes and determines both the will of God and the will of his creatures. In fact, since God is self-existent, absolutely independent, and all-powerful, he always does what he knows to be fittest to be done, and he therefore acts always according to the strictest rules of infinite goodness, justice, truth, and all other moral perfections. Thus, according to Clarke, virtue consists in the conformity of actions to the fitness of things.

Although Cockburn advocated Clarke’s view on fitness, there are strong clues that she was not directly influenced by it. In fact, Cockburn had already developed her own view by the time she wrote her Defence (1701/02), before Samuel Clarke introduced his fitness doctrine in the Boyle Lectures he gave in 1705. Despite the difference in terminology, what is “suitable to human nature” for Cockburn seems to correspond to what is “fit” for Clarke (Bolton 1993, 575-586; Sheridan 2007, 147-148).

Cockburn’s Remarks upon Some Writers was mainly influenced by Edmund Law’s 1731 English translation of De Origine Mali by William King (1702). In commenting on Leibniz’s theory of the best of all possible worlds, Cockburn explains that God is perfectly free to choose which world to bring into actual existence, but although the creation of a particular system proceeds solely from a determination of the will of God, the relations and fitness of things in it are necessary, and his will must itself conform to that fitness. She entertains a partially intellectualist moral view, according to which God’s intellect seems to have a priority over his will, insofar as he perceives the eternity and inalterability of the necessary relations of all possible things. This is the reason why God could never want pain as suitable and pleasure as unsuitable for sensible beings, for it would be contrary to the system of relations of this world in which every living being aims at happiness.

In response to an accusation of inconsistency between this doctrine and the Lockean epistemological foundation of morality Cockburn presented in her Defence, a lengthy footnote was added to the 1751 edition of her Defence. The critic is unidentified, and it is still unclear whether the responding note was written by Cockburn herself or by Birch, especially because it refers to the author in the third person. However, it has been convincingly shown that it is quite faithful to Cockburn’s view (Bolton 1993, 570). In this footnote she explains that although the grounds of moral obligation have not been discussed in the Defence, she nonetheless explicitly rejects “the notion of founding morality on arbitrary will” and implicitly supposes “the nature of God, or the divine understanding, and the nature of man […] to be the true grounds of it” (I: 61). Interestingly, Cockburn here distinguishes between “real laws,” which “imply authority and sanctions,” and “the law of nature,” which “obliges us, not as dependent, but as reasonable beings” (I: 61). God himself, the Supreme Rational Being, “who is subject to no laws, and accountable to none,” is obliged to do always what is right and fit (I: 61-62). She reaffirms that God’s command and will, and rewards and punishments, are necessary to morality as they “only give it the force of a law,” but they are not the source of obligation (I: 61-62).

Cockburn’s view on obligation has been recently seen as a mark of her independence and originality. In fact, it seems to be something different from Locke’s view that moral obligation is grounded in a superior decree (Sheridan 2007, 145-46). However, it is worth noting that Locke mainly expressed this position in his Essays on the Laws of Nature, which remained unpublished until 1954, and it is unlikely that Cockburn read it. Nevertheless, her position undoubtedly differs from Locke’s.

In her Remarks upon Some Writers, Cockburn claims that all human beings have a moral sense, which operates in them before any sort of revelation. However, she explains that this moral sense, contrary to the view of the Scottish Enlightenment thinker Francis Hutcheson (1694-1746), is not an innate, blind instinct, but “a consciousness consequent upon the perceptions of the rational mind” (I: 407), and it can be cultivated and improved by the right use of our abilities. Although she allows that the faculty that distinguishes between right and wrong is probably innate “since it operates in some measure on all mankind” (I: 407), its exercise depends on custom and education. Such a moral sense also acknowledges that virtue consists in the law of human nature, and it accordingly approves virtuous actions and disapproves the contrary. Thus, the obligation that human moral sense perceives as a duty arises from the eternal fitness of things and does not depend on the will of God and the sanctions of his laws, but can only be enforced by them. In fact, Cockburn argues that since mankind is a system of creatures that continually need one another’s assistance, it is necessary that everyone contributes to the good and preservation of society according to her/his capacity. To this purpose, human beings are so far pushed towards virtue by their moral sense that all of them naturally feel the moral “obligation of living suitably to a rational and social nature” (I: 413). For Cockburn, it is plain that as a rational being should act suitably to reason and the nature of things, so a social being should promote the good of others: these ends are suitable to the nature of rational and social beings, and the contrary would be as absurd as preferring pain to pleasure.

Cockburn further explains this point in her Remarks upon the Principles and Reasonings of Dr. Rutherforth: as human beings, we are naturally inclined towards happiness by our self-love, which is “increased by our practice of moral good” and in turn “naturally incline us to continue in that practice” (II: 20). Thomas Rutherforth (1712-1771) notes that if the desire of our happiness (that is, our own interest) expands in proportion to our practice of virtue, it follows that the more we are virtuous, the more we grow selfish, and paradoxically, the practice of virtue will be “fatal to itself, by strengthening that self-love” (Rutherforth 1744, 65). Cockburn objects to Rutherforth that although a vicious misapplication of self-love is actually dangerous to virtue—for instance if it is applied solely to private interest and self alone—true self-love is not the same as selfishness. Instead, it is a disinterested benevolence which involves the happiness of others. Thus our virtue, by strengthening our self-love, is in return strengthened by it. An “undeniable instance” of such a disinterested benevolence is provided by the “natural affection of parents for their children” (II: 20).

However, human beings are imperfect creatures, and when exposed to irregular passions, they can deviate from the rule of their duty. Thus God, who foresees everything, decided to link their natural duty to his own will by declaring that he would eternally reward obedience or punish disobedience. But in doing so, he gave men only a new motive for the performance of their duty, not a new foundation for it. To summarize, although Cockburn allows that eternal moral truths, Revelation, God’s command, and his will all play an important role in morality, she argues that they do not provide a sure and true foundation, since it is only in their conformity to rational and sociable human nature that they are moral motives for human beings. Cockburn adopts a strongly anthropocentric view of morality, which combines Locke’s principles of knowledge and Clarke’s metaphysical instances.

3. Religion

According to Thomas Birch, her official biographer, Catharine Cockburn was born into a Protestant family and was therefore educated in the Anglican religion. Nevertheless, while she was very young, her intimacy with several unidentified Catholic families pushed her toward the Church of Rome, and she embraced that communion until 1707, when she converted back to the Church of England. Her conversion was probably inspired by her long acquaintance with Gilbert and Elizabeth Burnet during her stay in Salisbury. Nonetheless, it seems to be a coherent consequence of her intellectual and philosophical trajectory.

Cockburn’s view on religion was neither rigid nor enthusiastic: she was not a fierce follower of her communion, and at the same time, she was allergic to any blind faith in dogmas. On the contrary, she believed that the best religion was “the knowledge and practice of our duty” in agreement with God’s revelation (II: 157). She explains that since happiness is for human beings the primary and necessary motive of all their actions, and it consists in living suitably to their rational and sociable nature, it follows that “our duty” consists exactly in living suitably to our nature. Now, a true religion must necessarily aim at guiding men in the correct practice of their duty, and it must therefore be both reasonable and committed to politics. As regards reasonableness, Cockburn rejects the Remarker’s position that religion would be “better established on the nature of God” (I: 59), arguing that the nature and will of God can be seen as a strong foundation of religion only insofar as they conform to human reason. As regards politics, Cockburn maintains that since men have a natural inclination toward other human beings and their happiness, a true religion must take care of the good of government and society. As a matter of fact, she concludes that if a religion should be unpolitic and destructive to society, it would necessarily be false, since “nothing can be a law to nature, which of direct consequence would destroy nature” (I: 59).

It is worth noting that in Cockburn’s antidogmatic view on religion, reasonableness and sociability are assumed as indispensable criteria of truth, and consequently there is not only one possible true religion, but any communion that satisfies those criteria would be true. She believed that Christianity should be grounded on a single necessary article of faith, namely the divine nature of Jesus Christ. Thus, all distinctions among churches did not concern elements necessary to salvation but depended only on formal aspects of worship: she explains simply that in reading dark passages in the Holy Scriptures, men give different interpretations of unessential articles of faith. However, these interpretations have too often been defended with excessive zeal, which has made “the terms of communion straighter than God has made the terms of salvation” (I: 14), therefore causing massacres and persecutions among Christians. Cockburn ironically notes that “those who are most bigoted to a sect, or most rigid and precise in their forms and outward discipline, are most negligent of the moral duties, which certainly are the main end of religion” (II: 177). On the contrary, she believed that during the Reformation, there was “rather a separation of than from the church” (II: 135), and none of the resulting communions had the absolute authority to direct our faith. Otherwise, the Scripture would have given us incontestable directions to find the true faith. Accordingly, she argues that since there is no church in the world that is infallible and absolutely perfect in all points, everyone should follow that church that she/he is satisfied with, even if some of the unessential points in it seem unconvincing, unless they are proven to be dangerous for salvation. Moreover, she believed that such a choice was not irreversible, for all human beings have the right to read the Holy Scriptures directly, and they should also “have the liberty of judging for themselves” (I: 24) whether or not their church acts in agreement with the words of God. Similarly, she found the Church of Rome’s pretence of infallibility unacceptable, because it was not confirmed by textual evidence. Thus, she eventually decided to go back to the Church of England.

4. Metaphysical Issues

In her philosophical writings, Catharine Cockburn also deals with a variety of metaphysical issues, some of which are closely connected with her account of morality. Among others, she was particularly concerned with the following two: (a) the nature of the soul and the related themes of its immateriality, immortality, and the possibility of thinking matter; and (b) the ontological reality of space.

a. The Nature of the Soul

In his Essay Concerning Human Understanding, John Locke argues against Descartes that the cogitative activity of the soul is not continuous, and “that the soul always thinks” is not a self-evident proposition and needs proof and the support of experience. However, experience itself clearly shows that the soul is sometimes absolutely without thought, for example, in a deep and dreamless sleep. According to Locke, thought is to the soul what motion is to the body, that is, one of its operations—maybe the most peculiar—but not its essence. Thus, as the body does not always move, so the soul does not necessarily always think. Locke also emphasizes that “our faculties cannot arrive at demonstrative certainty” of the immateriality of the soul, although it is highly probable. However, the highest degree of probability does not exclude the possibility of thinking matter, for “God may, if he pleases, give, or have given to some systems of matter a power to conceive and think” (Locke 1824, IV.3: §6).

The author of the Remarks upon an Essay Concerning Humane Understanding expresses serious worries over Locke’s view, insinuating that it could endanger the immortality of the soul, fostering materialism and atheism. Catharine Cockburn carefully examines the Remarker’s objections and replies point by point.

Firstly, she claims that since the supposition that “the soul always thinks” does not prove that it is immortal, the contrary supposition does not take away any proof of its immortality. Her reasoning proceeds as follows:

(1)   “The soul always thinks” (a) is not a necessary truth, as Locke had shown;

(2)   If (1), then the contrary proposition, that is, “the soul does not always think” (b), is at least possible;

(3)   Even if thinking were necessary for a soul to exist now—this is far from being demonstrated—this would neither prove that that soul has always existed, nor that it will always exist;

(4)   From (3), it follows that (a) does not provide sufficient evidence for the immortality of the soul;

(5)   From (2) and (4), it follows that (b) cannot be necessarily taken as an argument against the immortality of the soul.

Therefore, she concludes—rebuffing the objections by the Remarker—that Locke’s hypothesis that men do not think in sound sleep does not weaken the Christian doctrine of immortality.

Secondly, the Remarker is particularly afraid that if all our thoughts be extinct in sound sleep, the soul itself would be extinct as well, and we would have a new soul every morning, or in other words, we would be new men every day. According to the Remarker, this is extremely dangerous for the doctrine of Resurrection; for how could we be the same persons on Judgement Day if we are different men every day?

Cockburn insightfully tackles her adversary’s difficulty by stressing that as a body continues with its existence when any motion ceases, and it is always the same body when a new motion is produced, so the soul exists even during an unthinking sleep, and it is the same soul when it wakes up. She imputes to the Remarker a loose use of language, especially when he takes soul, man, and person to signify the same thing, ignoring that for Locke these terms have different meanings: man is understood as the union of soul and body, and person as self-consciousness. According to Locke, consciousness only makes the same person: in fact, despite all changes that a man’s body can suffer throughout his life, he continues to recognize himself as himself, inasmuch as he has consciousness of his past actions and thoughts. As his consciousness extends backwards, so his personal identity reaches. Cockburn was sure that this was sufficient to prove that Locke’s view on identity was consistent with Christian Revelation and that it did not imply any sort of Deism as the Remarker had insinuated.

Interestingly, in defending Locke’s position, she points out that personal identity consists “in the same consciousness, and not in the same substance: in fact, whatever substance there is, without consciousness there is no person,” and “wherever there are two distinct incommunicable consciousnesses, there are two distinct persons, though in the same substance” (I: 73). It is not clear whether Cockburn’s interpretation was entirely faithful to Locke’s view on identity, and in fact, commentators still disagree as to whether Locke entertained a substance-based or a mode-based theory of person. This is a long debate which can be traced back to Edmund Law (1703-1787), who was the first proponent of a mode reading of Locke’s view in his Defence of Mr. Locke’s Opinion Concerning Personal Identity (1769). However, although Cockburn could not enter into such a controversy since it was not yet in place when she died, it is evident that she gave a mode interpretation of Locke’s theory of personal identity over sixty years before Law (Gordon-Roth 2015, 71-72).

Thirdly, regarding the Remarker’s concerns with thinking matter, Cockburn notes that we have only an idea of the nature of the soul formed upon its operations, but we do not know whether the soul has essential properties distinct from matter, whereby it alone has the power of thinking. Echoing Locke’s agnosticism regarding substantial dualism, Cockburn emphasizes that we do not know whether there is an ontological and substantial difference between thinking and unthinking beings, and consequently, we do not know whether the substratum that supports thought is material or immaterial. Furthermore, she shows that the Remarker’s strategy of considering the immateriality of the soul as the main proof of its immortality has dangerous consequences for morality. In fact, she observes that human beings generally lack either leisure or capacity for metaphysical speculations, and if they believe that the soul is immortal, it is irrelevant whether they consider it immaterial or not. However, if we rest the proof of the immortality of the soul on its immateriality, it would be sufficient to weaken the proofs of the soul’s immateriality in order to debunk our belief in its immortality (Gordon-Roth 2015, 67-69).

Despite her trenchant criticism against the Remarker, it is evident that Cockburn did not necessarily disagree with him on the immateriality of the soul—and indeed, she never argued that the soul is corporeal. She adopted an astute three-point strategy: first, she emphasized that any claim about the nature of the soul must be demonstrated, since it is beyond the limit of our understanding; second, she showed that the hypothesis of the soul’s immateriality can have dangerous implications for morality; and third, she pushed the burden of proof to her adversary, who had to prove why his view was preferable to Locke’s (Thomas 2015, 258; Gordon-Roth 2015, 70).

b. The Reality of Space

In her Remarks upon Some Writers (1743), Catharine Cockburn examines, among other things, some of Edmund Law’s objections against Samuel Clarke concerning the nature of space.

Clarke argued that space necessarily exists because it is an entity that contains things and matter. Although it is not sensible, it cannot be nothing, since space has properties, while “nothing” does not: space has quantity and dimension; it is infinite, immutable, continuous, uncreated, and eternal. Nevertheless, Clarke had also defined space and time as divine properties or modes, which are not independent beings but depend on the only self-subsisting being, namely God.

Law objected that space is only an abstract idea in our mind that is formed by perceiving extended substances and abstracting from them the idea of space. He also rejected Clarke’s position that space is a real being because it has the property of containing bodies, since this makes no more sense than saying that darkness has qualities because it has the property of receiving light. According to Law, space is nothing or just an absence of extended bodies, and for this reason it does not have any properties.

Against Law’s view, Cockburn affirms the real existence of space. She argues that while “extension” is an abstract idea that can be predicated of both space and matter, “space” is not. Actually, space, matter, and extension are strictly connected notions, but space consists neither in matter nor in extension. Rather, she believed that the idea of space is “early obtruded upon the mind by senses, and unavoidable perceived by it” (I: 389), and accordingly, it precedes the idea of extension and does not depend upon our capability of abstracting. Moreover, we could neither conceive the real existence of bodies nor their motion without the idea of space, where they exist and move. Thus, we should admit or reject them altogether.

In the same writing, Cockburn also considers Isaac Watts’ view on space, according to which we do not know what class of beings space should be placed into. Watts had argued that space cannot be a mode of being (because its idea subsists independently of the existence of other beings), but it is not a substance either (because it is neither material nor spiritual), and it cannot be God (because while space is measurable, God is not at all). Thus, Watts concludes that space “must be nothing” (Watts 1733, 19). Cockburn objects that Cartesian substantial dualism, manifestly adopted by Watts, does not provide a necessarily adequate division of being, and she conversely observes “that there may be other substances than either spirits or bodies” (I: 390). Explicitly embracing the doctrine of the “great chain of beings” (I: 391), Cockburn holds that there is a gradual progression in the ontological structure of nature, by which the most imperfect beings are connected to those that are close to perfection. Since this hierarchical organization of beings must be full and continuous, “there should be in nature some being to fill up the vast chasm betwixt body and spirit, otherwise the gradation would fail, and the chain would seem to be broken. […] And why may not space be such a being” (I: 391)? Thus, she concludes that we cannot define space as nothing just because we do not know what it is. Otherwise we should come to the same conclusion about unextended substances whose reality we have no idea of.

Finally, assuming space to be a real being, Cockburn was inclined not to ascribe infinity to space. She considers two kinds of infinity: a positive infinity and a negative one. Positive infinity, as described by Clarke, is understood as a metaphysical infinity, namely an absolute perfection, to which nothing can be added. In this sense, infinite space can be identified with an attribute of God. Negative infinity—as Locke had explained—is something to which more can be endlessly added. Cockburn notes that such a notion of infinity can only be applied to general abstract ideas such as number, duration, and extension, but “it should not be ascribed to space by those who allow space to be a real particular being” (I: 401).

Cockburn’s substantival account of space has been recently seen as a further proof of her intellectual autonomy and philosophical originality: it offers a credible third way between Descartes’ view that space is a substance but with no divine properties, and Newton’s position that space has many properties, including all those usually attributed to God, but it is not a substance (Thomas 2013).

5. Originality

In the 21st century, Catharine Cockburn’s acumen has been recognized by commentators, and a growing body of literature shows that she held original philosophical positions, although some scholars do not consider these positions fully new (Nuovo 2011, 248-249).

As noted above, all her philosophical works were written in defense of either Locke or Clarke, and she was consequently forced in turn to follow the line of reasoning of their critics. However, her writings and private correspondence show that her intention was not merely to vindicate those eminent philosophers but rather to enter into the most lively controversies of that time and contribute to them.

In Cockburn’s philosophy, we have found at least four marks of originality and intellectual autonomy:

First, we have seen that she was selective about which ideas of Locke’s and Clarke’s to defend. Particularly, although her idea of fitness aligns with Clarke’s, there are strong reasons to believe that she had developed her own view independently (Bolton 1993).

Second, we have considered that her view of moral obligation was grounded in human reason and sociability, showing that it clearly differs from Locke’s view that obligation is constituted in superior decree (Sheridan 2007).

Third, Cockburn anticipated a strong debate concerning the interpretation of Locke’s theory of personal identity, proposing a mode reading over sixty years before Edmund Law, who has been generally seen as the first proponent of this interpretation (Gordon-Roth 2015).

Fourth, we have examined her hypothesis of the ontological reality of space, according to which space is a substance and has divine properties. In this doctrine, some commentators have seen an original alternative both to Descartes’ dualistic view and Newton’s non-substantial theory of space (Thomas 2013).

6. References and Further Reading

a. Primary Sources

  • Burnet, Thomas (?). 1697a. Remarks upon an Essay Concerning Humane Understanding: In a Letter Addres’d to the Author. London: Wotton.
  • Burnet, Thomas (?). 1697b. Second Remarks upon an Essay Concerning Humane Understanding: In a Letter Addres’d to the Author. London: Wotton.
  • Burnet, Thomas (?). 1699. Third Remarks upon an Essay Concerning Humane Understanding: In a Letter Addres’d to the Author. London: Wotton.
  • Clarke, Samuel. 1998. A Demonstration of the Being and Attributes of God. Edited by Ezio Vailati. Cambridge: Cambridge University Press.
  • Cockburn, Catharine (née Trotter). 1751. The Works of Mrs. Cockburn, Theological, Moral, Dramatic, and Poetical, Several of Them Now First Printed, Revised and Published with an Account of the Life of the Author by Thomas Birch. 2 vols., London: J. and P. Knapton.
  • King, William. 1731. An Essay on the Origin of Evil. Edited and Translated by Edmund Law. London: Thurlbourn.
  • Locke, John. 1824. “An Essay Concerning Human Understanding.” 4th edition. In The Works of John Locke in Nine Volumes, edited by John Locke. London: Rivington.
  • Rutherforth, Thomas. 1744. An Essay on the Nature and Obligations of Virtue. London: Thurlbourn.
  • Watts, Isaac. 1733. Philosophical Essays on Various Subjects, 2nd edition. London: R. Ford.

b. Secondary Sources

  • Bolton, Martha Brandt. 1993. “Some Aspects of the Philosophy of Catharine Trotter.” Journal of the History of Philosophy 31, no. 4: 565-88.
    • A classic in Cockburn scholarship, this is one of the first papers that consider Catharine Trotter Cockburn as an original and independent philosopher.
  • Broad, Jacqueline. 2002. Women Philosophers of the Seventeenth Century. Cambridge: Cambridge University Press.
    • A detailed analysis of the contribution of women to philosophy in early modern England. Broad explores the philosophical writings of five figures, including Margaret Cavendish, Anne Conway, Mary Astell, and Catharine Trotter Cockburn.
  • Connor, Margaret. 1995. “Catharine Trotter: An Unknown Child.” American Notes and Queries. Quarterly Journal of Short Articles 8, no. 4: 11-14.
    • This brief paper unconvincingly questions Cockburn’s birthdate.
  • De Tommaso, Emilio M. 2017a. “Il razionalismo etico di Catharine Trotter Cockburn.” Intersezioni XXXVII-1: 19-38.
    • A study of Cockburn’s moral philosophy presented as a sort of ethical rationalism.
  • De Tommaso, Emilio M. 2017b. “‘Some Reflections upon the True Grounds of Morality’—Catharine Trotter in Defence of John Locke.” Philosophy Study 7, no. 6: 326-339.
    • An analysis of Cockburn’s main arguments in favor of the compatibility between morality and Locke’s epistemology.
  • Duran, Jane. 2013. “Early English Empiricism and the Work of Catharine Trotter Cockburn.” Metaphilosophy 44: 485-94.
    • An examination of the empiricist legacy of Cockburn’s philosophy.
  • Gordon-Roth, Jessica. 2015. “Catharine Trotter Cockburn’s Defence of Locke.” The Monist 98: 64-76.
    • An excellent examination of some metaphysical issues in Cockburn’s Defence of Locke, including the immateriality and immortality of the soul.
  • Hutton, Sarah. 1998. “Cockburn, Catharine (1679-1749).” In Routledge Encyclopedia of Philosophy, edited by Edward Craig. Routledge: London. doi: 10.4324/9780415249126-DA017-1
  • Kelley, Anne. 2001. “‘In Search of Truths Sublime’: Reason and the Body in the Writings of Catharine Trotter.” Women’s Writing 8, no. 2: 235-50.
    • Argues that Cockburn’s project was to challenge the convention of women’s intellectual and moral inferiority, demanding their right to a public voice.
  • Kelley, Anne. 2002. Catharine Trotter: An Early Modern Writer in the Vanguard of Feminism. Aldershot: Ashgate.
    • A milestone in Cockburn scholarship covering both her literary and philosophical works.
  • Linker, Laura. 2010. “Catharine Trotter and the Humane Libertine.” Studies in English Literature 1500-1900 50, no. 3: 583-99.
    • An insightful examination of some libertine resonances in Cockburn’s comedy Love at Loss. The paper also focuses on women’s lack of power, especially after marriage.
  • Myers, Joanne E. 2012. “Catharine Trotter and the Claims of Conscience.” Tulsa Studies in Women’s Literature 31, no. 1/2: 53-75.
    • A comprehensive analysis of the role of religious themes in Cockburn’s writing.
  • Nuovo, Victor. 2011. Christianity, Antiquity, and Enlightenment: Interpretations of Locke. New York: Springer.
    • In this detailed collection of essays on the Christian philosophy of John Locke, Victor Nuovo devotes a chapter to Catharine Cockburn’s enlightenment.
  • O’Neill, Eileen. 2005. “Early Modern Women Philosophers and the History of Philosophy.” Hypatia 20, no. 3: 185-97.
    • This is an in-depth analysis of the reasons why early modern women philosophers disappeared from the history of philosophy by the twentieth century.
  • Ready, Kathryn J. 2002. “Damaris Cudworth Masham, Catharine Trotter Cockburn, and the Feminist Legacy of Locke’s Theory of Personal Identity.” Eighteenth-Century Studies 35, no. 4: 563-76.
    • This paper emphasizes the feminist implications of both Masham’s and Cockburn’s interpretations of Locke’s view on personhood.
  • Sheridan, Patricia. 2007. “Reflection, Nature, and Moral Law: The Extent of Catharine Cockburn’s Lockeanism in her Defence of Mr. Locke’s Essay.” Hypatia 22, no. 3: 133-51.
    • A thorough examination of Cockburn’s Defence. The author provides convincing proof of Cockburn’s originality and independence from Locke.
  • Sheridan, Patricia. 2011. “Catharine Trotter Cockburn.” In the Stanford Encyclopedia of Philosophy. http://plato.stanford.edu.
  • Sund, Elizabeth. 2013. “The Right to Resist: Women’s Citizenship in Catharine Trotter Cockburn’s The Revolution of Sweden.” In Political Ideas of Enlightenment Women: Virtue and Citizenship, edited by L. Curtis-Wendlandt, P. Gibbard, K. Green, 141-156. New York: Ashgate.
    • This essay focuses on Cockburn’s view on citizenship, exploring political and feminist concerns in her final play, The Revolution of Sweden.
  • Thomas, Emily. 2013. “Catharine Cockburn on Substantival Space.” History of Philosophy Quarterly 30, no. 3: 195-214.
    • Provides evidence that Cockburn’s account of substantival space was new and original.
  • Thomas, Emily. 2015. “Catharine Cockburn on Unthinking Immaterial Substance: Souls, Space, and Related Matters.” Philosophy Compass 10, no. 4: 255-63.
    • A careful examination of metaphysical themes in Cockburn’s philosophical works.
  • Waithe, Mary Ellen. 1991. “Catharine Trotter Cockburn.” In A History of Women Philosophers. Vol. III: Modern Women Philosophers, 1600-1900, edited by M.E. Waithe, 101-125. Dordrecht: Kluwer Academic Publishers.
    • One of the first works that includes Cockburn in the history of philosophy. It explores her moral philosophy and some metaphysical issues as presented in her Defence of Mr. Locke’s Essay.
  • Walmsley, J.C., Hugh Craig, and John Burrows. 2016. “The Authorship of the Remarks upon an Essay Concerning Humane Understanding.” Eighteenth-Century Thought 6: 205-43.
    • Argues that attribution to Richard Willis is more probable than the traditional attribution to Thomas Burnet.
  • Williams, Jane. 1861. “Catharine Cockburn.” In The Literary Women of England, edited by J. Williams, 170-188. London: Saunders, Oatley and Co.
    • In her broad study of the literary women of England, Jane Williams devotes some pages to Catharine Cockburn.

 

Author Information

Emilio Maria De Tommaso
Email: emdetommaso@unical.it
University of Calabria
Italy

Properties

A stone, a bag of sugar and a guinea pig all weigh one kilogram. A lily, a cloud and a sample of copper sulphate are white. A statue, a dance and a mathematical equation are beautiful. The fact that distinct particular things can be the same as each other and yet different has been the source of a great deal of philosophical discussion, and in contemporary philosophy we would usually say that what makes distinct particulars qualitatively the same as each other is that they have properties in common. The stone, the sugar and the guinea pig all instantiate the property of weighing one kilogram, while the lily, the cloud and the copper sulphate all instantiate the property of being white. The distribution of properties determines qualitative sameness and difference.

At this point, the consensus ends and a variety of philosophical questions arise about the nature of properties and their relationship to other entities and each other. Is the category of properties a fundamental one, or is the existence of properties determined by the existence of something else? Are some properties more fundamental than others? What is the relationship between properties and causation, and causal laws? What is the relationship between properties and meaning? Do properties determine what could and what could not happen? Do they determine which natural kinds there are? Do properties exist independent of the mind?

Table of Contents

  1. What Are Properties? Ontological Questions
    1. The Ontological Basis of Properties
    2. Nominalism versus Realism
  2. The Identity and Individuation of Properties
    1. Extensional Criteria
    2. A Revised Extensional Criterion: The Modal Criterion
    3. Hyperintensional Criteria
    4. Dualism about Properties and Concepts
    5. The Causal Criterion
    6. Quiddities
  3. Which Properties Are There?
    1. Families of Properties
    2. Maximalism versus Minimalism
  4. Problems with Instantiation
    1. The Instantiation Regress
    2. The Paradox of Self-Instantiation
  5. Categorical and Dispositional Properties
    1. Do Dispositional Properties Depend upon Categorical Ones?
    2. Dispositional Properties from Categorical Ones
    3. Dispositional versus Categorical Properties
    4. Explanatory Uses for Dispositional Properties in Metaphysics: Laws and Modality
    5. Problems with Pan-Dispositionalism
  6. Properties and Natural Kinds
  7. Different Types of Properties
    1. Intrinsic and Extrinsic Properties
    2. Accidental and Essential Properties
    3. Monadic and Polyadic Properties
    4. Determinable and Determinate Properties
    5. Qualitative and Non-Qualitative Properties
    6. Technical Terms for Property Types
  8. Realism about Properties: Do Properties Exist?
  9. Properties in the History of Philosophy
    1. Ancient Theories of Properties
    2. Medieval Theories of Properties
    3. Properties and Enlightenment Science
  10. References and Further Reading

1. What Are Properties? Ontological Questions

a. The Ontological Basis of Properties

Properties are also known as ‘attributes’, ‘characteristics’, ‘features’, ‘types’ and ‘qualities’. The question of whether properties are a fundamental category of entities or whether qualitative similarity and difference is determined by the existence of something else has been a feature of philosophical debates since ancient times. (See Section 9.)

In contemporary philosophy, there are four main accounts of the ontological basis of such entities: universals, tropes, natural classes and resemblance classes. The alternative to any of these accounts is to treat properties as ungrounded entities which require neither further explanation nor ontological grounding. To see the difference between the different accounts of the ontological basis of properties, let us consider three instances of being white: the lily, the cloud and the sample of copper sulphate. The universals theorist maintains that each of these instances of white is an instance of universal whiteness, an entity which is either transcendent, in that it exists whether or not it is ever instantiated, or immanent, in that it is wholly present in each of its instances. In the latter case, universals exist as part of the spatio-temporal world, whereas in the former they are abstract.

The trope theorist regards each instance of whiteness as an individual quality, not simply in the case of different types of white particulars such as the lily, the cloud and the copper sulphate, but also across particulars of the same type: the whiteness of each sample of copper sulphate is a distinct trope. Tropes are particular, unrepeatable entities, but this ontology of individual qualities must also have the resources to ground resemblance between tropes. The trope theorist wants to be able to say, for example, that the individual white tropes in a bunch of lilies resemble each other, but the nature of this resemblance is a matter of contention. Some theorists hold that trope similarity is primitive, a matter of unanalysable fact (Maurin 2002), while others maintain that tropes fall into resemblance classes or natural classes (Ehring 2011). Whatever the details of the formulation, it is crucial for a viable theory of properties that some such similarity between tropes obtains, because without it the ontology of tropes is one of bare particulars. In that case, the individual white tropes possessed by each lily would be no more similar to, nor different from, each other than they are to the red of the stoplight, the taste of the chocolate bar or the texture of the lizard, and that fails the very first demand of what we want a property theory to do. Similarity or resemblance between tropes is required alongside the mere existence of individual qualities themselves.

In the third and fourth accounts of qualitative similarity and difference, particulars are of the type they are by virtue of being members of sets of particulars: the lily, the cloud and the copper sulphate are all members of the set of white things, and it is in virtue of this that these particulars are white. If set membership is all that is required to be a property, then this view yields a super-abundant, over-populated ontology of properties: anything is a member of infinitely many sets with other things, but not all of these collections mark objective similarities. In order to deal with this over-population problem, the set-theoretic account of properties might add that some of this infinite collection of sets are more natural than others, making the account of properties one of natural classes of particulars (Lewis 1983a, 1986). The resemblance class theorist postulates a less abundant range of properties by maintaining that particulars belong to the classes they do because of primitive resemblance relations between them (Rodriguez-Pereyra 2002). Strictly speaking, however, although the natural and resemblance class theories give an account of qualitative similarity and difference, they may not all count as property theories; whether they do or not depends upon whether one opts to identify the classes of particulars with properties or not.

b. Nominalism versus Realism

A key factor which influences the decision about which ontological account of properties to accept is the question of whether general, repeatable or universal entities exist, or whether the entities which exist in the world are all particulars.

This debate is usually described as one between nominalism and realism, although care is needed here because these terms have other philosophical meanings as well. Within the discussion of properties, nominalism is taken to mean denying the existence of general or repeatable entities such as universals, in favour of an ontology of particulars; however, it is also used to mean ‘denying the existence of abstract objects’. These positions are independent of each other and, in the case of property theories, it is possible to be a nominalist in the sense of denying the existence of abstract objects while accepting the existence of universals (and, conversely, to deny the existence of universals while accepting abstract objects, as some resemblance nominalists do). For instance, David Armstrong’s account of properties as immanent universals is consistent with denying the existence of abstract objects while accepting the existence of repeatable, universal entities (Armstrong 1978a, 1978b). From now on, ‘nominalism’ is reserved for the denial that general, repeatable or universal entities exist.

Similarly, the term ‘realism’ is also ambiguous, this time within the study of properties: one might be a realist in the sense of being a realist about universals or repeatable entities; or, more broadly, one might be a realist about the existence of properties. This section considers realism in the former sense and postpones discussion about the existence of properties until Section 8.

In the context of theories of properties, we can distinguish realism, which accepts the existence of universals (either immanent or abstract) or which treats properties as a fundamental category of entities, from two versions of nominalism. The first, moderate nominalism, accepts that individual qualities or properties exist in the form of tropes, while the second, sometimes described as extreme nominalism, denies the existence of any fine-grained qualities or property-like entities at all. The appearance of objective similarity and difference in nature must, for the extreme nominalist, be accounted for in terms of sets of concrete particulars (where set membership is not, on pain of circularity, determined by the properties which the particulars have) or in virtue of the particulars falling under a certain concept or a certain predicate applying to them. The former is known as set or class nominalism if no further account is given of why particulars belong to the classes which they do, although some sets may be considered to be more natural than others (see 3b); however, some proponents of this set-theoretic version of extreme nominalism maintain that particulars belong to the classes which they do in virtue of the particulars resembling each other (Rodriguez-Pereyra 2002).

Alternative versions of extreme nominalism refuse to give any reductive account of why distinct particulars are qualitatively similar to each other, dismissing this phenomenon (which gives rise to the debate between nominalists and realists in the first place) as not needing explanation. In this view, which is associated with Quine (1948), the One Over Many Problem is not a genuine philosophical problem: we can give an account of why ‘b is F’ and ‘c is F’ are true in terms of the particulars b and c existing and the predicate F applying to them. We do not require anything more than this semantic theory of predication, according to this version of extreme nominalism; and so not only do we not need to postulate universals, we do not need to postulate an alternative ontological category of particulars such as tropes, nor to give a reductive account of properties in terms of predicates or concepts of the kind which other extreme nominalists might support. This denial of the problem is disparagingly called ‘Ostrich Nominalism’ by Armstrong (1978a, 16) because of the ostrich’s habit of putting its head in the sand in the face of danger, but Quine’s view is defended from this charge by Devitt (1980). (See also Armstrong’s response to Devitt, 1980.)

The extreme nominalist position is usually motivated by suspicion about the ontological nature of universals since these must either be abstract objects, with the particulars which have them participating in or instantiating these abstract entities, or immanent universals which are wholly present at each instantiation. In both cases, one might be concerned that we do not have an account of the relationship between particulars and the universals which they instantiate: that is, what instantiation is. Moreover, if instantiation is itself a relation, its existence may lead to an infinite regress (see Section 4a). One might also be concerned about whether we can understand how immanent universals can be wholly present at many locations at once. In the apparent absence of strict criteria of identity or individuation for universals, which might shed light upon what being a universal amounts to, the extreme nominalist suggests that we should avoid ontological commitment to such entities on the grounds that they are ontologically mysterious (Devitt 1980).

On the other hand, the realist about universals complains that the extreme nominalist’s view is unexplanatory or that she has the direction of explanation the wrong way around. For instance, the extreme nominalist who accounts for qualitative similarity in terms of predicates (sometimes called a ‘predicate nominalist’) explains that distinct particulars are red because the predicate ‘is red’ applies to them; but, the realist urges, the more coherent explanation is that the predicate ‘is red’ applies to the particulars because each of the particulars has the property of being red. In short, it is more coherent to explain why predicates apply to particulars in terms of the properties which they have, rather than the other way around. The same criticism would apply to other forms of extreme nominalism which characterise qualitative similarity between particulars as being a matter of their belonging to the same set or their being subsumed under the same concept. According to Armstrong, the extreme nominalist is either ‘failing to answer a compulsory question in the examination paper’ (1978a, 17) by rejecting the One Over Many Problem, or is getting the answer to that question wrong.

The moderate nominalists, who attempt to occupy the middle position between the realists and extreme nominalists, accept that there is a fine-grained ontological category of qualitative entities, but they insist that these are particular qualities rather than general, repeatable or universal entities. The initial complaint from the realist about these moderate forms of nominalism, such as trope theory, is that if tropes are individual qualities with no relations of similarity or difference between them, then they are each as unlike each other as they are alike and so they fail to satisfy the primary desideratum of a theory of properties because we still have no account of what qualitative similarity is. However, it is crucial to note that this criticism is only effective against naïve accounts of trope theory. As was noted above, more sophisticated forms of trope theory remedy this difficulty by giving an account of similarity between tropes, either by postulating primitive resemblance relations between tropes or by postulating versions of class or resemblance nominalism where tropes are the members of natural or resemblance classes, rather than particulars. Thus, such trope theorists cannot be charged with failing to provide a coherent ontological basis for qualitative similarity.

Despite this, however, the dispute between realists and moderate nominalists lingers on, with the former claiming to have the simpler ontology in comparison with trope theory, and accusing the versions of trope theory which treat resemblance between tropes as primitive of accepting too much as unanalysable brute fact. The trope theorists counter by repeating their complaints about the mysteriousness of universals, and as yet there is no clear winner in this debate. Even Armstrong (1992), who was committed to grounding similarity in immanent universals, admits that trope theory has comparable explanatory power to his favoured universals theory.

It would be easy to spend the remainder of this article evaluating these alternative accounts of the ontological basis of properties and the respective benefits of realism or nominalism. However, since each of the theories covered by both realism and moderate nominalism provides a workable property theory which gives an account of qualitative similarity and difference, this project would be superfluous to current requirements. Moreover, although each of these views has its committed proponents, some philosophers have suggested that a principled decision between the options is one which cannot be made in isolation from other, broader philosophical commitments such as those concerning the nature of modality or the existence of abstract objects (Allen 2016), or, if not, then it is a choice which is not of great philosophical significance (Hirsch 1993). With these additional difficulties in mind, the question of whether nominalism or realism is preferable, and the more specific matter concerning which nominalist or realist theory is the best, will not be pursued further.

2. The Identity and Individuation of Properties

It is at least useful—or, some philosophers would argue, imperative (Frege 1884, Quine 1948)—for there to be an account of identity and individuation for each category of entities. If we do not have an account of what determines whether an entity E is exactly the same entity as a member F of the same ontological category as E, or what makes E and F distinct from each other, we do not have a clear conception of what kinds of entities E and F are. To put the point simply: what determines that E = F, or what individuates E from F? The identity and individuation criteria required are constitutive, rather than epistemic, so we need not know (nor even be able to know) whether one property is the same as another in every particular case; it is the question of what makes it the case that one property is the same as another which is at issue.

This requirement for identity and individuation criteria for each category is a general one in metaphysics—applying equally to other categories such as sets, objects and persons—but it is one which has proved problematic in the case of properties because it is a difficult requirement for the property theorist to satisfy. Thus, those who treat the provision of identity criteria as mandatory for a category of entities to be legitimate go as far as rejecting the objective existence of properties, qualities, attributes and such in favour of versions of nominalism which rely on predicates or sets of concrete individuals instead (see Section 1b).

a. Extensional Criteria

The initial problem is that properties cannot be identified by their spatio-temporal location alone (as we might do with particular objects) because many distinct properties can be co-located. Nor do properties satisfy extensional identity criteria like sets do; that is, a property cannot be identified by the set of individuals which instantiates it, at least if we just take actual individuals into account. Purely by accident, all individuals with a property P might also have property Q and so the set of all P individuals will be identical with the set of all Q individuals. If we accept a set-theoretic extensional account of property identity, then P = Q. For example, we can imagine a world in which everything which has the mass of exactly one gram is also a sphere, and that nothing else in that world is a sphere. In such a world, being a sphere = having mass 1g because the set of individuals which instantiates being a sphere is the same set as that which instantiates having mass 1g, since sets are identified by the elements they contain. But it is utterly counterintuitive to identify these properties: it seems possible that something which is not a sphere could have a mass of 1g, or that a sphere could have a mass other than 1g. This is known as the problem of accidental coextension.
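The problem of accidental coextension can be made concrete with a small sketch in Python, in which the extension of a property is modelled as a set of individuals. The toy world, the names of its individuals and the helper `extension` are illustrative assumptions only; nothing here is drawn from the theories under discussion.

```python
# A toy world: each individual is recorded with a shape and a mass in grams.
# All names and values are hypothetical, chosen only for illustration.
world = {
    "a": {"shape": "sphere", "mass_g": 1},
    "b": {"shape": "sphere", "mass_g": 1},
    "c": {"shape": "cube", "mass_g": 5},
}


def extension(world, predicate):
    """Return the set of individuals in `world` that satisfy `predicate`."""
    return frozenset(name for name, facts in world.items() if predicate(facts))


def is_sphere(facts):
    return facts["shape"] == "sphere"


def has_mass_1g(facts):
    return facts["mass_g"] == 1


spheres = extension(world, is_sphere)
one_gram_things = extension(world, has_mass_1g)

# Sets are identified by their members, so the two extensions are one and the
# same set, and a purely extensional criterion would count the two properties
# as identical even though the predicates test different things.
accidentally_coextensive = spheres == one_gram_things
```

In this toy world the two extensions coincide, although the predicates `is_sphere` and `has_mass_1g` plainly test for different features of an individual.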

With the obvious candidates rejected, the search for identity criteria for properties must look elsewhere. Part of the difficulty with how to proceed at this point arises because we need at least a rough picture of how many properties there are in order to ascertain whether a proposed criterion matches our intuitions about properties or not. The question of the number of properties which there are might, in turn, be affected by what one thinks that properties do: are properties causal entities, such as causes and effects, or entities which determine natural laws or regularities in nature? Are they semantic values; that is, do they determine what the predicates of our language mean? Or, are they something else besides?

Some of these options will be discussed below, but for now it is enough to note that the interconnections between these issues make it difficult to give a unique and plausible account of property identity in the abstract. Nevertheless, there are some viable candidates for such a criterion.

b. A Revised Extensional Criterion: The Modal Criterion

First, one could take seriously the intuition that the set-theoretic account of property identity, which was rejected above on the grounds of accidental coextension, might be acceptable if we considered all the possible individuals which instantiate a property, rather than just all the actual individuals which instantiate it. The problem with accidental coextension is that the same set of individuals happen to instantiate apparently distinct properties P and Q, although it seems plausible to think that an individual could exist which instantiated P without instantiating Q. But that problem will be alleviated if we include such possible individuals in the set in the first place. However, in order to do this, possible individuals must exist in the same sense as actual ones and so, following David Lewis, we must accept that modal realism is true (Lewis 1986). If we do, there is a constitutive, modal criterion of property identity based on the necessary coextension of identical properties; equivalently, for the modal realist, properties are identical if they are instantiated by the same set of possible and actual individuals.
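As a rough illustration of the modal criterion, the following Python sketch compares extensions across more than one world. It assumes, for simplicity, that possible individuals are world-bound and can be listed in finite dictionaries; the worlds, the individuals and the helper `modal_extension` are hypothetical and are not part of Lewis's own formulation.

```python
# Two toy "possible worlds". Individuals are treated as world-bound, so every
# name occurs in exactly one world. All data is hypothetical.
actual = {
    "a": {"shape": "sphere", "mass_g": 1},
    "b": {"shape": "cube", "mass_g": 5},
}
merely_possible = {
    "c": {"shape": "cube", "mass_g": 1},    # a 1g thing that is not a sphere
    "d": {"shape": "sphere", "mass_g": 9},  # a sphere that does not weigh 1g
}


def modal_extension(worlds, predicate):
    """Collect every individual, across all given worlds, satisfying `predicate`."""
    return frozenset(
        name
        for world in worlds
        for name, facts in world.items()
        if predicate(facts)
    )


def is_sphere(facts):
    return facts["shape"] == "sphere"


def has_mass_1g(facts):
    return facts["mass_g"] == 1


# Restricted to the actual world, the two properties are coextensive.
same_in_actual = modal_extension([actual], is_sphere) == modal_extension(
    [actual], has_mass_1g
)

# Once merely possible individuals are included, the extensions come apart.
all_worlds = [actual, merely_possible]
same_across_worlds = modal_extension(all_worlds, is_sphere) == modal_extension(
    all_worlds, has_mass_1g
)
```

On this sketch the properties count as identical only if their extensions coincide across every world considered, which is why adding the merely possible individuals separates being a sphere from having mass 1g.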

One might object that Lewis’s modal criterion does not individuate properties finely enough, however. For instance, some distinct properties appear to be necessarily coextensive in his view: being a triangle and being a closed three-sided shape are instantiated by all the same actual and possible individuals but, one might argue, they are not the same property and so we do not want to identify them as Lewis’s criterion would do. At this point, the supporter of the modal criterion has a choice of two responses: first, he might deny the objector’s intuition that being a triangle and being a closed, three-sided shape are distinct properties. Or he might question the example in another way by arguing that such properties are not coextensive anyway, either because they are instantiated by distinct individuals or else because they are relations between different parts of the same individuals. Being a triangle and being a closed three-sided shape involve angles and sides respectively, regardless of whether broadly speaking they are instantiated by the same individual things (Rodriguez-Pereyra 2002, 100). However, a consequence of this move is that we cannot rely upon our intuitions about whether a property is monadic or polyadic (see 7c for more on this distinction).

Alternatively, if one decides to identify necessarily coextensive properties to preserve the modal criterion, there are also difficulties. First, it seems plausible that someone might have contradictory beliefs about a property: Sam believes that he has drawn a triangle, but Sam does not believe that he has drawn a closed three-sided shape. If we want properties to ground the distinction between these beliefs, or between propositional attitudes in general, then there will have to be a finer-grained distinction between properties. This matter is particularly pressing if one hopes for a property theory which helps to account for meaning or representation.

Secondly, the modal criterion identifies all indiscriminately necessary properties—properties which trivially apply to everything (see 7f)—since these too are necessarily coextensive. Properties such as being such that the number thirty-seven exists, being such that 2 + 2 = 4, and is dancing or not dancing apply to every possible individual and so all turn out to be identical with each other. One might regard this as an advantage on the basis that indiscriminately necessary properties are a dubious family of properties, although there do seem to be cases in which we are intuitively prone to distinguish them, such as when Sam believes that he is such that 2 + 2 = 4, but Sam does not believe that he is such that Fermat’s last theorem is true. If properties directly determine mental content, Sam cannot have both a true and a false belief about the same property.

c. Hyperintensional Criteria

In order to deal with these problems, we seem to require a finer-grained, hyperintensional criterion of property identity that can distinguish between properties which are necessarily coextensive. There is not much consensus about what the basis of such a criterion would be: one might think that properties are individuated linguistically or formally, so the property of being triangular and red would be distinct from being red and triangular. Perhaps this individuates properties too finely, at least for many of the roles we have presumed that properties play. Alternative hyperintensional accounts identify properties with objectively existing concepts (Bealer 1982) or with abstract objects (Zalta 1983, 1988). Alternatively, one might turn to the quiddistic criterion of property identity discussed below.

d. Dualism about Properties and Concepts

The main problems for the modal criterion seem to arise when we are trying to employ properties to give an account of mental representation, or to capture differences between someone’s psychological states. If this is the case, one might argue that we could supplement the ontology of properties (identified and individuated according to the possible and actual individuals which instantiate them) with a finer-grained ontology of concepts or linguistic entities. Properties could be coarser grained, perhaps identified and individuated according to the modal criterion, while predicates or concepts could be employed in the explanation of psychological states. (Bealer 1982. See Nolan 2014 for criticism of this strategy.)

e. The Causal Criterion

An alternative, and potentially much more coarse-grained, account of property identity is proposed by Shoemaker (1980) who suggests that properties can be identified and individuated in virtue of their causal roles. Thus, property P is identical with property Q if and only if P and Q have all the same causes and effects. Such a criterion exploits the fact that properties are causally related to each other and, furthermore, many properties appear to enter into these causal relations essentially: having mass of 1 kg is having whatever it is that requires 1 N force to accelerate at 1 m/s² in a frictionless environment, and which will create 9 × 10¹⁶ Joules of energy when the 1 kg mass is destroyed. Because the causal relations in question are usually general causal relations, versions of this criterion are sometimes characterised as identifying and individuating properties in terms of their nomological or nomic role: that is, the role which the respective properties play in laws of nature, whether causal or structural (Swoyer 1982; Kistler 2002). The causal and nomological role criteria are sometimes grouped together as structuralist accounts of property identity and individuation, since what is essential to a property is its relations to other properties (and perhaps also to other entities).
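
The figures in the 1 kg example can be checked with a short calculation. This is a minimal sketch, assuming that the two laws in play are Newton's second law and mass-energy equivalence; the variable names are illustrative only.

```python
# Worked arithmetic for the 1 kg example (SI units throughout).
mass_kg = 1.0
acceleration = 1.0        # m/s^2
c = 299_792_458.0         # speed of light in m/s (exact by SI definition)

force_newtons = mass_kg * acceleration    # Newton's second law: F = m * a
rest_energy_joules = mass_kg * c ** 2     # mass-energy equivalence: E = m * c^2

print(force_newtons)                  # 1.0
print(f"{rest_energy_joules:.2e}")    # 8.99e+16, i.e. roughly 9 x 10^16 Joules
```

The point of the example is that these quantities specify what having a mass of 1 kg does, which is all that the causal criterion takes to be essential to the property.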

The utility of the causal criterion might be restricted, however: if any properties do not enter into causal relations—that is, if they are uncaused and also causally inert—the causal criterion will not apply to them. Also, properties which are epiphenomenal (if any exist) will also be omitted, unless these can be identified and individuated on the basis of their causes alone. Spatio-temporal properties and properties of abstract objects (if there are any) are particularly problematic in this regard. Given these problems, one might maintain that the ontology of properties is mixed, with some which are essentially causal properties and others which are not. If so, however, the causal criterion is not a general criterion of what makes properties the same as each other or different, and thus it does not illuminate what in general a property is. Nevertheless, as the causal conception of properties has become more popular, more research has been done to explain how properties which do not appear to be essentially causal are essentially causal after all (Mumford 2004; Bird 2017; Williams 2017).

At this point, it is worth noting a metaphysical distinction between two closely related views which are consistent with property structuralism: one can take the causal relations which a property enters into as its constitutive identity criteria, or one can take properties to have an essentially causal nature which then determines the respective relations which each property enters into. In the former view, the nature of a property is determined by the relations in which it stands, whereas in the latter, the nature of a property determines the relations in which it stands. If one cares about there being strict identity criteria for each category of entities (Quine 1948), then the former provides non-circular identity criteria for properties (on the assumption that the nature of the relations into which a property enters is not determined by the nature of the property), whereas the latter view does not. Rather, the latter view asserts that each property has or consists of an intrinsic causal (or nomological) nature which serves to identify and individuate it. Although this move will not satisfy those who require strict identity criteria, it is argued that assuming that properties have intrinsic, essentially causal natures can facilitate a rich and fruitful theory of causation, laws, modality and perhaps more, and thus that it is worth abandoning methodological scruples for metaphysical benefits. These theories are discussed in Section 5.

If either of these structuralist conceptions of properties is correct, then a property could not have different causes and effects from those it has, because the causal relations which it enters into are constitutive of its nature (or else its nature determines which causal relations it enters into). Each property has its causal or nomological role necessarily. (A property might have different causes and effects in different background conditions, or in conjunction with different properties, but that is different.) One argument given in favour of this conception of properties is how well it fits with our understanding of fundamental properties via the physical sciences: in keeping with the example at the beginning of this section, we can empirically determine what properties can do whereas it is not obvious that we have the same epistemic access to what their qualitative nature is (for exceptions, see the next section). It would be parsimonious, as well as convenient, to think that there is nothing more to being a property than its contribution to causal or nomological processes.

f. Quiddities

Against the structuralist conceptions of properties discussed in the previous section, one might be concerned that there is more to a property than its causal or nomological role; or, going further, that the nature of a property is only contingently related to the role it plays in causation or laws. If this is the case, the nomological role R played by a property P in the actual world could be played by Q in another possible situation; and furthermore, P (which has actual role R) could have nomological role S in another possible situation. Moreover, one might worry that the causal or nomological criteria try to characterise properties in terms of their relations to other things, rather than as they themselves are internally. For instance, Armstrong notes that ‘properties are self-contained things, keeping themselves to themselves, not pointing beyond themselves to further effects brought about in virtue of such properties’ (Armstrong 1997, 80). If one takes this view, then what are properties and how are they identified? One might suggest that each property has a unique intrinsic qualitative nature known as a quiddity.

Some philosophers have complained that quiddities are obscure entities, distinguished by brute, unanalysable qualitative differences between them. Moreover, they imply a primitive account of transworld identity for properties; that is to say that what makes an entity the same property in different situations is nothing to do with the nomological, causal or other theoretical role that it plays, but simply to do with it having or being the same quiddity (Black 2000). A property Q which makes things appear blue to the human eye in normal light in the actual world could make things taste of chocolate in another. What makes property Q be Q in that counterfactual situation is that it has the same quiddity. The primitive qualitative ‘this-ness’ which quiddities impart to properties makes them analogous to haecceities, whatever it is which makes a particular the particular which it is (over and above the properties it instantiates). (See Schaffer 2005 for some disanalogies between quidditism and haecceitism.)

The postulation of quiddities presents epistemic challenges which Lewis (2009) notes, since it is not clear how we are able to acquire knowledge about quiddities if any effect that they could have upon us is associated with a specific quiddity only contingently. Furthermore, one might recall the parsimony argument of the previous section, presented in favour of forms of property structuralism: science does not appear to require the postulation of quiddities and can deal with properties entirely in terms of their causal or nomological role. If we do not need to postulate quiddities, why bother?

The supporter of quiddities has at least three responses available here as well as another way of side-stepping the worst of the criticism without reconciling with the structuralist. The first response is the most direct, arguing that we do have epistemic access to the qualitative nature of properties in our conscious experience (Heil 2003, who does not support a quiddistic conception of properties but one in which properties are both essentially causal and qualitative). The main difficulties for this response are to maintain the analogy between qualia and quiddities, and to argue that our conscious experience is broad enough to support a general argument for the existence of quiddities of properties which do not appear to us in conscious experience.

Secondly, one might argue that although quiddities are obscure when considered to be distinct, or partially distinct, entities from the properties which they individuate, they are not so obscure when regarded as being the properties themselves (Locke 2012). This latter conception of properties does not treat them as having internal qualitative natures in virtue of which they are individuated but as being those natures; in this view, properties are individuated in a primitive way simply by being numerically either the same property or a different one. Although this alternative conception gets rid of quiddities, and so placates the proponent of the parsimony argument, it does not advance our understanding of the individuation of properties beyond there being primitive qualitative differences between them.

The third response could take the form of a tu quoque argument against the supporters of a structuralist conception of properties, since there are epistemic challenges for them too; even if we identify and individuate properties in virtue of their causal roles, it is not obvious that empirical investigation will permit us to determine which properties exist (Allen 2002). Finally, one could argue that we do not need to accept quidditism in order to treat the causal roles of properties as being contingent, since there could be counterparts of actual, world-bound properties which play a different nomological or causal role. (See Black 2000; Hawthorne 2001; and Schaffer 2005 (who does not recommend this position).)

3. Which Properties Are There?

a. Families of Properties

There are not only many different properties, but many different families of properties: moral properties, such as good and bad; mathematical ones, such as being prime or being a convergent series; aesthetic ones, such as being beautiful; psychological ones, such as believing in poltergeists or wanting a drink; properties from the social sciences; and properties from the physical sciences. Every subject area about which we can think or speak has properties associated with it; and there are perhaps many more besides. This leads to questions about whether all these families of properties exist in the same sense as each other, and whether one family is dependent upon or determined by another. We might also consider how different properties within a family of properties are related. (For a selection of metaphysical distinctions between properties, see Sections 6 and 7.)

Some varieties of properties may be mind- or theory-independent—that is, they would exist whether or not humans (or other conscious beings) had ever existed to discover them—while others might be mind- or theory-dependent. The latter are classifications which depend for their existence at least partially upon the existence of conscious subjects to be the classifiers. One might, for example, consider physical or natural properties to exist mind-independently, and aesthetic properties to be mind-dependent. Another distinction between families of properties might come about due to differences in the entities which instantiate them. For instance, some properties such as mathematical ones might be instantiated by abstract objects, while others are possessed by spatio-temporal entities.

Despite the prima facie differences, one might think that these families of properties are related to one another. Perhaps one family of properties is entirely determined by the existence of another family. For instance, psychological, moral or ethical properties might be entirely determined by (broadly speaking) physical ones by a relation such as supervenience, realisation or grounding. Furthermore, while some accounts of supervenience relate facts rather than properties, properties still play a crucial role as constituents in facts or states of affairs. Mathematical properties might be thought to be determined by logical properties, but in that case the relation of determination is one of logical entailment rather than ontological priority. (See Frege and Russell.)

The question of which families of properties exist mind-independently and which do not, and whether interesting relations exist between families of properties, can be clarified only by examining specific features of the different subject areas associated with them, a much larger task than can be accomplished here. Furthermore, although it makes intuitive sense to divide properties into families such as the physical, the psychological and so on, further philosophical consideration reveals difficulties in clarifying such distinctions and making them philosophically rigorous while retaining an interesting account of the relationship between them. There is, for instance, not much philosophical substance to a distinction between physical properties and mental ones if these families can be defined only in opposition to each other.

Finally, one might be interested in whether some properties within a family are dependent upon others of the same family, making some individual properties more fundamental than others. For example, one might think that all ethical properties are determined by one or two fundamental ones—being good or being just, for instance—or one might maintain that mathematical properties are entirely determined by the properties of natural numbers. Again, it is the task of the different areas of philosophy concerned, such as Moral Philosophy or the Philosophy of Mathematics in these cases, to work out whether these dependencies are viable.

b. Maximalism versus Minimalism

The question of whether some properties are more fundamental than others, in the sense of their determining the existence of other properties, is also of more general metaphysical interest when we overlook the boundaries between different families of properties, since it is related to the question of how many properties there are. Does every possible property exist? Does every predicate pick out a property? Or are a few properties the ‘real’ or genuine ones, with the others which we appear to refer to either being ontologically determined by the genuine ones or being linguistic or conceptual entities?

The answers to these questions lie somewhere on a continuum between minimalism on the one hand, which maintains that a very sparse population of properties exists, and maximalism on the other, which asserts the existence of every possible property (and perhaps even some impossible ones). This contrast between the minimalist and maximalist ends of the continuum is also captured by two conceptions of properties as being sparse and abundant (Lewis 1983a). How we decide which point on this continuum is the most plausible depends in part upon the role we think that properties play in the world and also upon the identity conditions which we think properties have: that is, upon what makes one property the same as or different from another. Furthermore, it may turn out that there are different conceptions of properties in play, intended to fulfil different metaphysical roles, which may be able to coexist alongside each other. Thus, a dualist account of properties is also a possibility, or else one might find some way in which the sparse properties and the abundant ones are connected.

The minimalist maintains that the properties which exist are sparse or few in number, a set of properties which (may) determine the behaviour of the rest. From a physicalist standpoint, the properties of fundamental physics are the most promising candidates for being members of the minimal set of sparse properties: properties of quarks, such as charge and spin, as opposed to properties such as being made of angora, liking chocolate or being green. Some sparse properties may exist which we have yet to discover, and which we may never discover; their existence is in no way tied to our language use or what we have the ability to pick out. Although there are few sparse properties, this is a comparative claim: there may still be infinitely many of them if we consider determinate properties such as specific masses—such as having mass of 1.4 grams—to be more fundamental than the determinable property mass.

The maximalist, on the other hand, obeys a principle of plenitude with respect to which properties exist. At the extreme, every property which could exist does exist, although the range of properties which this principle permits depends upon how the ‘could’ in ‘could exist’ is understood. Perhaps one of the most abundant populations of properties is postulated by Lewis (and quickly rejected for not being metaphysically useful), who regards qualitative similarity and difference to be determined by membership in sets of actual and possible individuals. In the least discriminating understanding of this account of properties, any set of actual or possible individuals counts as a property, making the collection of properties into a super-abundant transfinite collection which far outruns our ability to name them. But, as Lewis notes, there are simply too many of these properties to be useful: ‘If it’s distinctions we want, too much structure is no better than none’ (1983a, 346). He therefore abandons this extreme maximalism in favour of an account of properties which is discussed below.

One could also retain a broad range of possible properties in a different way to Lewis’s sets of possible and actual individuals, perhaps by accepting the existence of transcendent universals, including universals which exist even though they are never instantiated by any actual individual. Such entities might even range beyond the possible to include universals which can never be instantiated, or which could be instantiated only if the laws of logic were non-classical, such as universals corresponding to the properties of being a round square or being a true contradiction.

A prima facie less abundant form of maximalism considers properties to be the semantic values of predicates, thus entities which either determine the meaning of any actual predicate in a human language or determine any meaning which there is or could be. (Whether this second maximal account of properties is only prima facie less abundant than the previous suggestion or is genuinely less abundant depends upon the relationship between possibility and range of meanings, a question which will not be considered here. If the range of possible meanings turns out to be coextensive with the range of possibilities, there may be no difference between these options.)

Even if we restrict ourselves to actual languages, there are many predicates, and so if there are properties which correspond with each of them, we will have a very abundantly populated ontology. How finely grained such a maximalist ontology is depends upon how we distinguish one property from another (or, relatedly, one predicate from another). In this view, there are uncontroversially properties for being red and being not red. But one might wonder whether there is a distinction between being red and not being not red which can be determined only when we have a principle for individuating properties or predicates. If the criterion is syntactic, then the properties being red and not being not red are distinct, but if the criterion is semantic, ‘being red’ and ‘not being not red’ are intuitively predicates picking out the same entity.

One might attempt to hold an intermediate position between maximalism and minimalism. For example, one might argue that which properties exist are those which have explanatory utility, giving us a more abundant population of properties than the minimalist physicalist accepts and a more restricted one than that which maintains that there is a property to determine the meaning of every predicate. But on reflection it is not clear how different this view will turn out to be from the maximalist accounts based upon the semantic values of predicates; after all, predicates exist because we use them in explanatory sentences. One might need a more restrictive account of legitimate explanations in order to whittle the range of properties down.

One advantage of a liberal, maximalist account of properties is epistemic: if properties are based upon predicates of our language, or on the types which we employ in our explanations, then properties are easy to find. Being an aardvark, or being igneous rock, or having influenza, or being a chair are all properties to which we refer and there is no need to go looking for some more fundamental, ‘genuine’ or ‘real’ set of properties to ground the types into which we classify things in our everyday and scientific explanations. However, this epistemic advantage over minimalism may not persist once we move away from the properties we encounter in the natural and human world and consider how we know about the myriad uninstantiated properties which most maximalists endorse, or once we consider the properties which are not instantiated by spatio-temporal objects but by abstract ones. These cases are particularly problematic because, if a version of the causal theory of knowledge is true, it is not clear how we could know about the properties of abstract objects or about properties which are not instantiated in the actual world at all. At this point, maximalism loses the epistemic advantage, although it still promises a useful account of meaning based upon which properties exist.

Second, the maximalist’s ontology of properties has a pragmatic advantage: the maximalist has a greater range of properties at her disposal, whereas the minimalist may discover that a property or a family of properties for which we have predicates does not exist.

Third, the maximalist can explain predicate meaning directly: the properties which exist determine what our predicates mean.

But for the minimalist, these advantages do not mitigate what he regards as the vastly uneconomical, overpopulated ontology of properties which the maximalist endorses. The maximalist accepts properties such as being threatened by a dragon on a Sunday and being fourth placed in the Mushroom Cup on MarioKart in the guise of a gorilla. The former is a property which has never been instantiated, while the latter is one which is only instantiated in a world of computer games, motor races and gorillas. Are we to say that these properties have always existed? If we are not, then they must have come into existence at some point in the history of the universe, in virtue of a more minimal set of properties which forms the basis for all the rest. If we treat these original properties as fundamental, the minimalist argues, then parsimony will be restored.

In addition to rejecting higher-level properties which appear to be superfluous to the causal workings of the universe, such as being within two miles of a burning barn or being fourth placed in the Mushroom Cup on MarioKart in the guise of a gorilla, some minimalists also adhere to a Principle of Instantiation and reject all alien properties which are never instantiated in the actual spatio-temporal world. Alien properties, such as being a perfect circle or being threatened by a dragon on a Sunday, are rejected in favour of treating them as conceptual or ideal entities which are mind-dependent.

Minimalists disagree about how minimal the set of sparse properties should be, with some physicalist minimalists accepting only the properties of fundamental physics (whatever they turn out to be). However, if we restrict properties to this extent, we are left with the question of what a great many things which we thought were properties actually are. If being water or being square, being green or being a mouse are not properties, then they must be something else, since they occupy such a central position in our worldview that eliminating them entirely from the ontology is out of the question. It does not seem plausible to treat them in the same way that Armstrong does with alien properties and to maintain that they are mind-dependent or ideal.

At this point, it seems that a compromise is needed. Both minimalism and maximalism are viable in their own right, but as far as explanation goes, they lack precisely what the other can provide. The minimalist’s properties can account for the fundamental nature of reality and perhaps also the causal processes which occur in it, while the maximalist can explain higher level predication and give an account of explanation and predicate meaning. Ideally, the property theorists would like the best of both worlds.

There are two ways in which this compromise can be achieved: first, by a form of dualism about properties which treats sparse and abundant conceptions of properties as different categories of entities (Bealer 1982). There is a sparse population of properties (or ‘qualities’ as Bealer calls them) and an abundant one of concepts, which are not mind-dependent entities in the way in which we often think about concepts, but rather objectively existing entities.

Second, one could accept Lewis’s strategy and give an account of how the sparse properties determine the existence of the abundant ones. According to Lewis (1983a, 1986), there is a fundamental set of sparse, perfectly natural properties which determine the existence of all the other properties by set-theoretic, Boolean combinations. All other properties lie along a continuum, placed according to how simply they are related to the perfectly natural ones. Those which are closely related count as natural properties, with naturalness being a matter of degree which is determined by closeness to perfectly natural properties. If we suppose that the sparse properties are physical ones, then properties such as being green or being a mouse are both natural to some degree or other, as is (to a lesser extent) being fourth placed in the Mushroom Cup on MarioKart in the guise of a gorilla, but eventually naturalness trails off. Being green is more natural than being grue (where ‘grue’ is defined as being green if observed before 2085, otherwise blue) while being grue* is less natural still. (Being grue* is defined as being green if observed before 2030 or blue if observed between 2030-40 or red if observed between 2040-50 or pink if observed between 2050-60 or . . . and so on for 30 disjuncts (Elgin 1995).) The abundant properties exist in virtue of being determined by the sparse natural properties.
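
The definitions of grue and grue* can be written out explicitly. What follows is a toy sketch, not Lewis's or Elgin's own formalism: it assumes that colours can be treated as basic, that each disjunct covers a half-open interval of observation years, and that the number of disjuncts is an acceptable crude proxy for how simply a property is related to the basic ones. Only the four disjuncts of grue* stated in the text are encoded; the remainder are elided there and are not filled in here.

```python
import math

# A property is modelled as a list of (start_year, end_year, colour) disjuncts.
# Half-open intervals and the disjunct-counting measure are illustrative assumptions.
GREEN = [(-math.inf, math.inf, "green")]
GRUE = [(-math.inf, 2085, "green"), (2085, math.inf, "blue")]
# Only the disjuncts given in the text; the rest of grue* is left unspecified.
GRUE_STAR_PREFIX = [
    (-math.inf, 2030, "green"),
    (2030, 2040, "blue"),
    (2040, 2050, "red"),
    (2050, 2060, "pink"),
]


def instantiates(definition, colour, year_observed):
    """True if some disjunct covers the observation year and matches the colour."""
    return any(
        start <= year_observed < end and colour == required
        for start, end, required in definition
    )


def complexity(definition):
    """Crude proxy for unnaturalness: more disjuncts, less natural."""
    return len(definition)


print(instantiates(GRUE, "green", 2024))   # True
print(instantiates(GRUE, "green", 2090))   # False
print(instantiates(GRUE, "blue", 2090))    # True
print(complexity(GREEN) < complexity(GRUE) < complexity(GRUE_STAR_PREFIX))  # True
```

On this measure being green comes out as more natural than being grue, which in turn is more natural than even the truncated grue*, matching the ordering described above.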

The ontological distinction which Lewis marks can also be characterized in other ways. For instance, Armstrong maintains that some universals are genuine ones, with the existence of other universals being determined by them. Such a distinction between perfectly natural sparse properties and the rest is a primitive one, however, and is thus not open to further analysis. If one considers parsimony to be an objective fact about the universe, then it is plausible to accept that some such minimal set of properties exists, but its existence has to be assumed rather than being argued for (McGowan 2002).

4. Problems with Instantiation

A particular is said to instantiate a property P, or to exemplify, bear, have or possess P. In the case of Platonic forms, the particular participates in the form of P-ness which corresponds to or is identified with the property P. One might wonder whether instantiation can be analysed further in order to give us some insight into the relationship between a particular and the properties which it instantiates, but it turns out that this is very difficult to do. In fact, instantiation runs into two major problems: the instantiation regress and problems about whether self-instantiation is possible.

a. The Instantiation Regress

The first problem arises if instantiation is treated as a relation. Presuming that relations are analogous to properties, or are a species of property, then the instantiation relation will behave in a similar way to a property. Let us say that particular b is P. If a relation of instantiation connects b with P, then b instantiates P. But then something must connect b, P and the instantiation relation (let us call it I1), and so there must be another instantiation relation I2 which does this job. However, now the question arises of what connects b, P and I1 with I2, and the answer must be that there is another instantiation relation I3 to do that; and then there must be another relation I4 to connect b, P, I1 and I2 with I3. For each instance of instantiation, we require another relation to bind it to the entities which we already have and so there will never be enough instantiation relations to bind a property P to the particular which has it. It appears that treating instantiation as a relation leads to an infinite regress, and so the instantiation relation is not coherent after all. (The instantiation regress is often associated with a regress suggested by F. H. Bradley (1893) and is thus sometimes known as ‘Bradley’s Regress’.)
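
The structure of the regress can be displayed step by step. The sketch below is purely illustrative: the function name `regress` and the string labels are hypothetical, and the code simply enumerates the relations which the argument above demands, stopping at an arbitrary depth because the series itself never terminates.

```python
def regress(particular, prop, depth):
    """Return the first `depth` instantiation relations which the regress demands."""
    relata = [particular, prop]
    steps = []
    for n in range(1, depth + 1):
        tie = f"I{n}"
        steps.append((tie, tuple(relata)))  # I_n must bind everything introduced so far
        relata.append(tie)                  # and I_n then needs binding in its turn
    return steps


for tie, bound in regress("b", "P", 4):
    print(tie, "binds", bound)
# I1 binds ('b', 'P')
# I2 binds ('b', 'P', 'I1')
# I3 binds ('b', 'P', 'I1', 'I2')
# I4 binds ('b', 'P', 'I1', 'I2', 'I3')
```

However large the depth is set, the final relation in the list is itself left unbound, which is why adding further relations never completes the task.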

There are several ways in which the property theorist might try to avoid this regress. First, she might appeal to the notion of an internal relation: that is, a relation which exists if the entities it relates exist. (Examples of internal relations include x being taller than y or x resembling y. All that is needed for such relations to hold is the existence of the things which they relate, Mount Everest and the Eiger for the former, for instance, or two black kittens for the latter.) However, one cannot say that instantiation is itself an internal relation because the existence of a particular b and a property P is not sufficient to determine that b is P. For example, the existence of a particular cat, Fluffy, and of the property of being white does not on its own guarantee that Fluffy is white; something more is required, in this case that Fluffy instantiates the property of being white. (Even if Fluffy is white, the problem here is that the relation between Fluffy and being white is a contingent one; Fluffy could exist and be black or tabby and so the mere existence of Fluffy and whiteness does not determine the existence of the instantiation relation. Although see Broad 1933, 85.)

David Armstrong argues that, while we cannot do without the first-order instantiation relation between particular and property, we can then treat whatever is required to bind particular, property and instantiation as being an internal matter. In terms of the example of the regress above, the additional instantiation relations, I2, I3 and so on, exist if particular b, property P and I1 exist such that b instantiates1 P. Nothing more is required, and the supposed regress is a cheap logical trick, rather than implying ontological infinitude. Armstrong claims that instantiation is a fundamental universal-like tie which is not open to further analysis.

Armstrong’s response depends strongly upon whether his account of internal relations is a plausible one. Do they provide, as he claims, an ontological free lunch (1989, 56; MacBride 2011, 162–6)? In addition, one might also question whether his solution works for every account of the ontology of properties. Armstrong’s account of instantiation is formulated for immanent universals—entities which are wholly present in each of their instantiations—but it is more difficult to think of instantiation as a fundamental, non-relational tie if it relates a particular to an abstract, transcendent universal, or to a resemblance class of which the particular is a member. If we are to treat instantiation as fundamental, then different accounts of the ontological nature of properties might require their own accounts of instantiation.

Alternatively, the property theorist might challenge the claim that the instantiation regress is vicious (Orilia 2006). If we further analyse the regress outlined above, we either require an infinite number of states of affairs to bind a particular to the property it instantiates, or each state of affairs (each particular’s instantiating a property) requires infinitely many constituents in order to exist (the particular, the property and infinitely many instantiation relations). Orilia distinguishes these as an external and an internal regress respectively, since in the former case the infinitude of additional entities is external to the original state of affairs of b’s being P, while the latter asserts that any state of affairs, such as b is P, does not simply contain b and P but infinitely many instantiation relations besides. Although this may not be what we intuitively expect of the relationship between particulars and the properties they have, one might argue that there is nothing ontologically wrong with such infinitude unless one has already presupposed that the world is finite. After all, we are happy to accept that the real numbers are infinite, such that there are infinitely many numbers between any two real numbers, and so it is not clear why such infinitude cannot occur in the natural world. There is, for instance, debate in the physical sciences about the existence of ‘real’ infinities (see Infinity, Section 4). If one allows that the world is infinitely complex, then the instantiation regress is not vicious, although its consequences for the way the world must be are quite counterintuitive (Allen, 2016, 29–31).

b. The Paradox of Self-Instantiation

It seems plausible to maintain that any property instantiates being a property, and furthermore (if one thinks that properties are abstract objects such as transcendent universals) that the property of being abstract instantiates the property of being abstract. It seems, in such cases, that it is possible for some properties to instantiate themselves and thus that there is such a property as being self-instantiating or a property’s instantiating itself. Moreover, the situation with the Instantiation Regress would be simplified if it were possible for instantiation to instantiate itself. That way, one might argue that the apparently infinite multitude of instantiation relations were in fact instances of the same relation, instantiated over and over again, with different numbers of relata each time on some versions of the regress. However, there is a logical problem with self-instantiation which has led some philosophers to suggest that self-instantiation should not be allowed.

Let us suppose that, for every property of being Q, there is also a negative property of being not Q. If this is the case, then there is a property of being non-self-instantiating or something’s not instantiating itself. But such a property appears to be logically impossible once we consider whether it instantiates itself: if the property of not instantiating itself does not instantiate itself, then it does instantiate not instantiating itself and so it instantiates itself. But if it does instantiate itself, then it has the property of not instantiating itself and so it does not instantiate itself. Either way we reach a contradiction: we have a paradox.
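
The logical shape of the paradox can be checked mechanically. The sketch below is a minimal illustration, not a formal treatment: it assumes that the only question is which truth value to give the claim that the property of not instantiating itself (here called N, a label introduced for the example) instantiates itself.

```python
def consistent_assignments():
    """Return the truth values for 'N instantiates N' that fit N's definition.

    N is defined so that, for any x:
        x instantiates N  if and only if  x does not instantiate x.
    Substituting N for x, a candidate value v must satisfy v == (not v).
    """
    return [v for v in (True, False) if v == (not v)]


print(consistent_assignments())
```

Neither truth value satisfies the defining condition, so the list of consistent assignments is empty.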

Faced with this paradox, one could take the rather extreme measure of banning self-instantiation entirely which would leave us in an implausible situation with respect to ‘properties’ such as being a property, which would not (strictly speaking) be a property. One might mitigate this consequence by introducing a theory of types for properties in addition to banning self-instantiation. Thus, we would have first-order properties which are instantiated by particulars, second-order properties which are instantiated by first-order properties, third-order properties which are instantiated by second-order properties and so on; each nth-order of properties can only be instantiated by the entities of the (n-1)th order. Being a property would then be a shorthand for being a second-order property (a property instantiated by first-order properties), or being a third-order property (a property instantiated by properties of first-order properties) and so on, and these properties do not self-instantiate. However, this hierarchy is perhaps too strict for daily use and conflicts with our intuitive judgments. For example, if a table instantiates the property of being crimson, it also instantiates the property of being red and being a colour; but the property of being crimson also intuitively instantiates being red and being a colour. However, if the theory of types is correct, we have to distinguish the first-order property of the table’s being red from the second-order property of crimson’s being red; different properties are involved in each case if we introduce a hierarchy.
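
The effect of the type restriction can be sketched in a few lines of Python. This is an illustrative model only: the class `Entity`, the function `can_instantiate` and the numbering of orders from 0 (for particulars) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    order: int  # 0 = particular, 1 = first-order property, and so on


def can_instantiate(bearer: Entity, prop: Entity) -> bool:
    """Typed rule: an nth-order property is borne only by (n-1)th-order entities."""
    return prop.order == bearer.order + 1


table = Entity("the table", 0)
crimson = Entity("being crimson", 1)
red_of_particulars = Entity("being red (first-order)", 1)
red_of_properties = Entity("being red (second-order)", 2)

print(can_instantiate(table, red_of_particulars))    # the table is red
print(can_instantiate(crimson, red_of_particulars))  # blocked by the hierarchy
print(can_instantiate(crimson, red_of_properties))   # needs a distinct property
print(can_instantiate(crimson, crimson))             # self-instantiation is ruled out
```

Because no entity's order can equal its own order plus one, self-instantiation is excluded by construction; the cost, as the crimson example shows, is that two distinct properties of being red are needed.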

Alternatively, one might solve the problem of self-instantiation by limiting which entities count as genuine properties and accepting a more minimalist position. This response rejects the premise that corresponding to every property Q, there is a property of being not Q which is instantiated just when Q is not. Thus, something which does not instantiate the property of being red does not thereby instantiate a further property of being not red, and we need not think that the property of not self-instantiating accompanies the property of self-instantiating. The paradox associated with there being a property of non-self-instantiation need not arise.

5. Categorical and Dispositional Properties

While Plato regarded participation in a form as making something the kind of thing it is, Aristotle also treated such kinds as giving a particular the causal power to do something, the potential to have certain effects. This contrast between understanding properties as qualitative, categorising entities and as dispositional or causally powerful ones survives in contemporary philosophy as the distinction between categorical and dispositional properties. We can conceive of a property such as mass in two contrasting ways: on the one hand, mass is a measure of how much matter a particular is made of; on the other, the mass of a particular determines how much force is required to move it, how much momentum it will have when moving and thus what will happen if it hits something else, and how much energy would be produced if the mass were to be destroyed.

Some philosophers argue that all dispositional properties are dependent upon categorical ones (Armstrong 1999; Lewis 1979, 1986; Schaffer 2005); others argue that all properties are dispositional and have their causal power necessarily or essentially (Cartwright 1989; Mumford 1998, 2004; Bird 2007; Marmodoro 2010a); some accept that a mixture of categorical and dispositional properties exists (Ellis 2000, 2001; Molnar 2003); and still others contend that all properties have a dispositional and a categorical aspect (Schroer 2013) or are both categorical and dispositional (Heil 2003, 2012). Dispositional properties, properties which have their causal roles essentially, are also known as dispositions, powers, causal powers and potentialities; however, it is important to note that these terms are not always used interchangeably.

a. Do Dispositional Properties Depend upon Categorical Ones?

There are three primary motivations for the view that all dispositional properties must depend somehow upon categorical ones: first, dispositional properties are regarded as epistemologically suspect, since we cannot experience a dispositional property as such. Second, dispositional properties are considered to be ontologically suspect. Third, it is thought that we do not need to think of dispositions or dispositional properties as being an ontologically independent category of entities because statements about the dispositional properties an individual instantiates can be analysed as conditional statements about the categorical properties which that individual instantiates, or else we can give an ontological account of how dispositional properties depend upon categorical ones. These issues are considered in turn.

The first motivation is more common within the empiricist tradition, but not exclusive to it. To say that a particular has a disposition or a causal power to do something does not entail that the causal power is actually manifested or that the effect is produced, since the particular may not be in the appropriate conditions for the effect to occur. For instance, although a particular sugar cube is soluble, such a disposition may never be manifested if the sugar cube is never near water; its being soluble ensures that it could dissolve, that it would were the circumstances to be right, and perhaps also that it must do so (although dispositionalists disagree about whether a causal power manifests itself as a matter of necessity in the appropriate circumstances). Thus, accepting the existence of irreducible dispositional properties involves accepting the existence of irreducible modality in nature, perhaps amounting to natural necessity, which makes each property produce its respective effects. As Hume pointed out, such natural necessity cannot be detected by experience, since we can only experience what is actually the case, and so strict empiricists have rejected irreducible dispositional properties on this basis. Some of those who think that at least some dispositional properties are irreducible to categorical ones accept this view about our experience and argue that we have other reasons to accept natural necessity, while others argue that we can experience irreducible modality in nature after all, perhaps through our own intentions being dispositional (Mumford and Anjum, 2011).

The second objection, an ontological one, is raised by Armstrong (1997, 79), who argues that accepting dispositional properties commits one to Meinongianism. As noted above, any particular instantiation of a property which is the power to M may never manifest M; however, such entities are still construed as being powers to do M and are often individuated in virtue of their manifestations. For example, solubility is the power to dissolve, combustibility is the power to burn, and so on. In committing ourselves to the existence of unmanifested dispositions, the objector argues, we are also committing ourselves to the being (in some sense or other) of their manifestations, a range of entities which do not exist. In most cases, dispositional properties are constituted by relations between instantiated powers and a non-actual manifestation, which Armstrong argues is both ontologically uneconomical and absurd, reminiscent of the ontological commitment attributed to Alexius Meinong by Bertrand Russell (1905). On this basis, Armstrong concludes, essentially dispositional properties should be rejected. (See Mumford 2004, 192–5; Handfield 2005, 452–461; and Bird 2007, 105–111 for responses.)

The third objection against irreducible dispositions is that we do not need to talk about dispositions and dispositional properties in the first place because we can translate disposition ascriptions into non-dispositional language. To that end, the conditional analysis of dispositions was first suggested by Carnap (1928, 1936–7), whose own account failed due to the fact that he insisted on analysing dispositions as truth-functional material conditionals. In Carnap’s proposal, we could analyse the dispositional predicate ‘is combustible’ as follows:

(C)  For any object o, if o is lit or otherwise ignited, o is combustible if and only if o burns.

The disadvantage of this account is that it provides a criterion to apply the predicate ‘is combustible’ only for objects which are ignited and says nothing about those objects which are not near any source of ignition. However, we intuitively want to say that the piece of paper on my desk is combustible and the water in the glass is not, whether or not these items are ever ignited. Carnap’s simple analysis leaves out the crucial aspect of dispositions and dispositional properties: the disposition or causal power to have a certain effect is present even when the disposition is not active and has no chance of being triggered because the requisite conditions do not obtain.
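
The point can be illustrated by reading (C) truth-functionally, as Carnap's account requires. The following Python sketch is a minimal illustration; the function names are introduced for the example and are not part of Carnap's own formulation. It lists which verdicts about ‘is combustible’ are compatible with (C) in a given case.

```python
def satisfies_c(ignited: bool, combustible: bool, burns: bool) -> bool:
    """(C) read as a material conditional: ignited -> (combustible <-> burns)."""
    return (not ignited) or (combustible == burns)


def verdicts_left_open(ignited: bool, burns: bool) -> list:
    """Truth values for 'is combustible' that (C) permits in the given case."""
    return [c for c in (True, False) if satisfies_c(ignited, c, burns)]


print(verdicts_left_open(ignited=True, burns=True))    # [True]
print(verdicts_left_open(ignited=True, burns=False))   # [False]
print(verdicts_left_open(ignited=False, burns=False))  # [True, False]
```

For an ignited object, (C) fixes a single verdict. For an object which is never ignited, the conditional is satisfied whichever verdict is chosen, so (C) does not distinguish the paper on the desk from the water in the glass.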

The failure of Carnap’s attempt to eliminate dispositional language led to more sophisticated accounts which attempt to analyse an object’s possession of a disposition in terms of subjunctive or counterfactual conditionals: that is, by capturing what the object would do were certain conditions to obtain (whether or not they do actually obtain). The most famous of these is the Simple Conditional Analysis which analyses disposition ascriptions as follows:

(CA) An object o is disposed to manifest M in conditions C if and only if o would M if C obtained.

(Ryle 1949; Goodman 1954; Quine 1960)

While this analysis is an improvement on Carnap’s attempt, there are several well-known counterexamples to it. First, the stimulus conditions may obtain and the disposition not manifest because the effect is masked. For instance, the paper is combustible because it would light were certain stimulus conditions to obtain (were it to be in contact with a source of ignition), but the disposition will not manifest if the atmosphere around it contains no oxygen; the lack of oxygen will mask its combustibility. Second, we can imagine a situation in which the presence of the conditions required for the disposition to manifest removes the disposition somehow; in our current example, perhaps the presence of a source of ignition also causes the paper to be soaked by water, making it, while wet at least, no longer combustible. A disposition where the presence of the requisite triggering conditions results in an object’s either acquiring or losing a disposition is known as a finkish disposition, following Martin (1994). Third, we can find examples in which the effect of a disposition is mimicked when the triggering conditions occur, even though the disposition is not present. For instance, consider Lewis’s famous Hater of Styrofoam (1997), who breaks Styrofoam containers each time they are struck, giving the impression that such containers are fragile when they are not. Such examples show that (CA) can be true while intuitively the dispositional predicate ‘is fragile’ should not be ascribed to the object; the conditional can be true when the disposition is mimicked.
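
The three kinds of counterexample can be set side by side in a toy model. The sketch below is a deliberate simplification: it assumes that the counterfactual in (CA) can be replaced by a simple deterministic rule, and the class `Case`, its fields and the function `would_manifest` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Case:
    label: str
    has_disposition: bool       # the intuitive verdict, before any stimulus
    masked: bool = False        # something blocks the manifestation
    fink_removes: bool = False  # the stimulus itself removes the disposition
    mimicked: bool = False      # something else produces the effect


def would_manifest(case: Case) -> bool:
    """Toy stand-in for 'o would M if C obtained', the right-hand side of (CA)."""
    disposition_at_stimulus = case.has_disposition and not case.fink_removes
    return (disposition_at_stimulus and not case.masked) or case.mimicked


cases = [
    Case("ordinary paper", has_disposition=True),
    Case("masked paper", has_disposition=True, masked=True),
    Case("finkish paper", has_disposition=True, fink_removes=True),
    Case("struck Styrofoam", has_disposition=False, mimicked=True),
]

# Cases in which the verdict of (CA) departs from the intuitive one.
mismatches = [c.label for c in cases if would_manifest(c) != c.has_disposition]
print(mismatches)
```

In this model the conditional and the intuitive ascription agree only in the ordinary case; masks and finks make the conditional false of a disposed object, while a mimic makes it true of an object which lacks the disposition.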

Difficulties with the Simple Conditional Analysis have led to refinements in this approach (Prior 1985; Lewis 1997; Manley and Wasserman 2008), although the Simple Conditional Analysis still has defenders who challenge the counterexamples of finks, masking and mimicking (Choi 2008). However, the complexities of eliminating dispositional ascriptions by analysing them as conditionals have encouraged many contemporary philosophers to take another look at the plausibility of treating dispositional properties more realistically, either as entities which depend for their existence on categorical properties and other entities, or as an independent ontological category.

b. Dispositional Properties from Categorical Ones

Armstrong takes a minimally realist attitude to dispositions: the dispositions which an individual has to act in this way or that are entirely determined by the categorical properties they instantiate and the laws of nature which govern them. Although such dispositions are real, they are a derived category of entities, not a fundamental one, since they are ontologically dependent upon categorical properties and laws. For Armstrong (1983), laws of nature are necessary connections holding between universals (which, as was noted above, Armstrong considers to be the ontological basis of properties) but these necessary connections can vary across different possible situations. Although in the actual world it is true that the instantiation of an F necessitates the instantiation of G, this necessary connection need not hold in counterfactual situations; in another possible situation, F may necessitate the instantiation of H instead of G. Thus, what a property does is determined by which laws obtain in the world in which it is instantiated, not by that property’s intrinsic nature. In Armstrong’s view, categorical properties and laws of nature are more fundamental than the dispositions they confer, and the causal disposition a property has is contingent upon what the laws of nature are in the world in which it is instantiated. Thus, what a property has the power to do can vary in different possible situations. (See Contessa 2015 for a criticism of this view.)

c. Dispositional versus Categorical Properties

Central to arguments about whether we should conceive of properties as categorical or dispositional are clashing intuitions about whether it is plausible for a property P with the causal power to do C1 in the actual world to have the power to do C2 in another possible world w. If so, and if this indicates a genuine possibility, then property P does not have its causal power as a matter of necessity; if this is not possible, then properties do have their causal roles necessarily (or because of their essential nature, if this is different) and are thus dispositional. For instance, in the actual world, particulars with like charges, such as two electrons instantiating negative charge, repel each other. But is it possible that like-charged particulars could attract each other? The supporter of categorical properties says ‘yes’ whereas someone who favours dispositional properties says this is not possible. The supporter of dispositional properties maintains that if there were a property which could make electrons attract, it would not be charge but a distinct property, schmarge (say). Since schmarge does not exist in the actual world it is an alien dispositional property, and rather than accept the existence of alien properties, some dispositionalists prefer to deny the possibility of electrons attracting.

The empiricist’s suspicion of the natural necessity inherent in dispositional properties is largely based upon an epistemic argument: how can we justify believing that such natural necessity exists, especially since we cannot find out about it through experience? However, the dispositionalist employs a converse epistemic argument which notes that the supporter of categorical properties also postulates entities which lie outside our epistemic grasp: if a property P can have different causal powers C1 and C2 in different possible situations, then the property itself must have a purely qualitative nature or quiddity which is only contingently associated with anything which P can do. Moreover, one and the same causal power C1 can be associated with distinct categorical properties P and Q, and so it is not clear how we determine that one property is being instantiated rather than another. It is plausible to think that we have experiential access to properties only via the effects which they have on us, but this makes the nature of quiddities as mysterious as natural necessity (especially from an empiricist perspective).

d. Explanatory Uses for Dispositional Properties in Metaphysics: Laws and Modality

These arguments are taken to establish the position that at least some properties are dispositional rather than categorical. This position, it is argued, has significant explanatory advantages for metaphysics considered more broadly. First, if properties essentially or necessarily involve having a specific causal role, then the causal relations between properties remain stable and the properties of an object bring about certain effects as a matter of necessity. These fixed relations between properties permit an account of causal laws as derived entities, which hold in virtue of dispositional properties and which hold as a matter of necessity (Mumford 2004). This, it is claimed, is respectively more coherent or more parsimonious than the accounts of laws available with an ontology of categorical properties which treat laws either as simply being contingent regularities holding in virtue of the distribution of properties in a world (Lewis 1973, 1994) or else require the postulation of second-order relations holding between properties or universals to act as laws of nature which govern what those properties do (Armstrong 1983).

Second, some supporters of a dispositional conception of properties argue that the essential, natural modality which such entities involve can be used to give a naturalistic account of possibility and necessity (Jacobs 2010; Borghini and Williams 2008; Vetter 2015). The dispositional properties which an individual instantiates determine what that object could do, and also what it must do in certain circumstances, thereby providing truthmakers for modal statements about that individual. Thus, statements such as ‘This coal could burn’ or ‘Hillary Clinton could be a physicist’ are made true by the dispositional properties which these individuals instantiate, or by properties which actually instantiated dispositional properties have the power to produce. This dispositionalist account of modality has, according to its supporters, the resources to provide an account of modality without recourse to abstract objects or to possible worlds. Furthermore, since some dispositionalists restrict what is possible to what is possible given the dispositional properties which exist, have existed and will exist in the actual world, this account of modality is an actualist one; it does not require ontological commitment to the existence of merely possible entities.

Although the formulation of these dispositionalist accounts of modality is still in the early stages, they already face some significant challenges. The primary difficulty concerns whether an ontology of actually instantiated dispositional properties can provide a broad enough modal range to match our common-sense intuitions about what is possible. For instance, logical and mathematical truths appear to be necessarily true, but we do not readily think of them as being made true by actual dispositional properties or causal powers. ‘2 + 2 = 4’ is always true, and intuitively could not be false, but it is not obvious what in the world makes it that way, nor whether it is coherent to say that everything has the disposition to make such statements true. The dispositionalist must either give an account of logical and mathematical necessities in terms of dispositional properties or permit an alternative account of them. (See Vetter 2015.)

Furthermore, claims such as ‘Dinosaurs could have developed digital technology’ or ‘If Coulomb’s Law is false, these two proximate negative charges would not repel’ present difficulties: the first because it is an unactualised possibility which seems very unlikely given the dispositional properties instantiated now or in the past, and the second because it is a counterlegal possibility, a possibility which concerns a situation which could only occur were the laws of nature in the actual world to be false. The dispositionalist can deal with the former type of example by allowing that possibilities are not only grounded by which dispositional properties are actually instantiated, but also by the dispositional properties which these actually instantiated properties could produce, and the ones which these latter, uninstantiated properties could produce, and so on. Thus, it does not matter that no dinosaur actually had the power to invent digital technology, nor that nothing actually has the power to cure cancer, because the possibility rests on something existing (or having existed) which has the power to produce the power to do so.

On the other hand, examples of counterlegal possibilities have proved a more intransigent problem for dispositionalist modality. If, as was noted above, the dispositionalist thinks of natural laws as being entirely determined by the dispositional properties or causal powers which the world instantiates, the actual dispositional properties instantiated in the world cannot also determine possibilities which run counter to those laws. It makes no sense to imagine that the world could have been exactly like the actual one and yet the laws of nature be different. If the dispositionalist wants truthmakers for counterlegal possibilities, then she must be committed to the existence of alien causal powers, ones such as schmarge, which are uninstantiated in the actual world. However, if the dispositionalist makes this move, then her theory has lost the advantage that it claimed over other theories of modality, since it is now committed to the existence of possibilia or abstract objects in order to ground modality. Given this, most dispositionalists restrict what is possible to what is possible given the causal powers which exist, have existed or will exist in the actual world, thus denying possibilities which could occur only if the actual laws of nature were false. In doing so, they accept that some intuitively plausible possibilities, such as ‘It is possible that this one kilogram of gold will not fall towards the Earth when it is unsupported’, are not genuine possibilities at all; the gold might not fall were the universal law of gravitation not to hold, but in this version of actualist dispositionalism, this law holds necessarily; situations in which there is no gravity are not genuinely possible. (Although see Borghini and Williams 2008 and Vetter 2015, who suggest that actual powers or potentialities might be able to determine possibilities which go beyond those permitted by the current laws of nature.)

Not all dispositionalists concur with the use of their ontology to ground necessity and possibility in this way. Mumford and Anjum (2011) have suggested an alternative account which argues that dispositions act with a sui generis modality—dispositional modality—which is weaker than necessity and yet stronger than contingency.

e. Problems with Pan-Dispositionalism

Pan-dispositionalism—the view that all properties are dispositional ones—faces several challenges to its coherence. First, there is the complaint that even among the natural properties, some properties are obviously not causal powers: properties such as being a cube or being red are not obviously ones which are essentially causal. The pan-dispositionalist’s answer is usually that such properties are dispositional after all: colours are properties with the power to cause certain wavelengths of light to be reflected, or to cause a specific reaction in ourselves and other animals, and being a cube is associated with various effects such as not being able to roll, being stackable, making a certain imprint in soft clay, and so on. The dispositionalist might add that such properties are continuously manifesting (Hüttemann 2013), which gives the appearance of there being a distinct set of categorical properties.

Second, the pan-dispositionalist ontology is vulnerable to the ‘always packing and never travelling’ objection: dispositional properties are potentialities to have certain effects, but if their manifestations consist in the production of more dispositional properties, the manifestation of the potential of a power consists in the production of more potentialities. (See Molnar 2003, 11.2 for variants of this problem.) This is an ontology of potentialities which ‘never passes from potency to act’ (Armstrong 2004). The critic of pan-dispositionalism argues that such powers must be supplemented by categorical properties to give the world actuality or being, or in order that actual events occur, rather than just the passing of potencies around. For instance, Heil argues that the world cannot be one in which properties are nothing more than contributions to what their bearers have the power to do because such bearers would be indistinguishable from empty space; there would be doing but no being, and this, Heil urges, does not make sense because there would be nothing to do anything at all. According to Heil, a purely dispositionalist ontology would be equivalent to an empty universe.

This objection could be met by accepting a theory in which properties are both qualitative and dispositional (Heil 2003, 2012; Schroer 2013), by permitting continuously manifesting dispositional properties which are analogous to categorical ones, or else by denying the need for a fundamental level (Schaffer 2003). However, Mumford (2004, 174–5) implies that these responses are not required, since the objection is based upon a misunderstanding of what being an essentially dispositional property or power involves, treating these entities as actual only in virtue of their producing actual manifestations. As Mumford argues, being potent (as these entities are) is a way of being and so it is wrong to think of pure powers as being mere potentialities in the first place.

Despite these difficulties in the formulation of a pan-dispositionalist ontology, it is thought by its supporters to have significant explanatory advantages over its rival which treats properties as categorical. The primary reasons for this are that dispositionalists can invoke the irreducible modality in nature in order to explain the necessity of causation and natural laws (Mumford 2004), or to ground an actualist account of modality which permits us to explain what is necessary and what is possible in terms of actually existing properties (Jacobs 2010; Borghini and Williams 2008; Vetter 2015).

6. Properties and Natural Kinds

The world appears to contain kinds of stuff as a matter of natural fact: water, elephants, gold, carbon dioxide, humans, red dwarf stars and so on. We can class these as ‘natural kinds’ and they are especially useful for making inductive inferences to be used for prediction and explanation. What exactly is the relationship between these kinds and properties? Some philosophers, with an exceptionally relaxed view of kinds (or a minimalist view of properties), argue that kinds and properties coincide: that is, that something’s being of a certain kind K simply involves the instantiation of a property and vice versa. However, although it is intuitively plausible to associate kinds with properties in some way, there seem to be more properties than there are kinds. Carbon, elephants, or stars each behave in a variety of ways in virtue of belonging to their respective kinds, while red things, or those which have a mass of 1.1 grams, display a much more restricted range of causal behaviour. Nevertheless, one might still think that this difference is a difference of degree (Bird 2014, 2).

Furthermore, if we do not restrict ourselves to what might be considered natural properties, the mismatch between properties and kinds is magnified. If we are trying to characterize what makes something a natural kind, there are plenty of properties—especially in an abundant conception of properties—which do not seem to be very natural. If it is contentious to consider green things as forming a kind, it seems even more so to include grue ones, or those which instantiate properties such as being on the eighth page of the first novel I read this year, being married to an ice-hockey fan, or being next to a marmoset. In view of this problem, one can either declare that the sharing of such properties does not mark out individuals as a kind or that there are some kinds which are non-natural ones. If one chooses the latter option, there may be further questions about how individuals of such non-natural kinds relate to the properties which they instantiate.

The simplest explication of a natural kind is that the individuals which belong to it share a property or a collection of properties (with some properties being excluded, as noted above). A subset of natural properties, or comparatively more natural properties if one prefers Lewis’s account of property naturalness, determines which natural kinds there are. In this view, natural kinds would be a derivative category and one might choose to dispense with them entirely in favour of the properties or collections of properties which are essential to each individual of the kind. For example, the kind water is coextensive with the property of being H2O, and we might call the latter the essence of water.

However, this essentialist view is difficult to sustain in the case of many paradigmatic examples of natural kinds, such as species. It is impossible to characterize exactly which properties determine that an individual tiger is a member of the kind tiger, in the sense of giving the properties which are necessary and sufficient for membership of the kind. Furthermore, because species evolve over time, there is not a good reason for thinking that the failure to find a set of properties which are necessary and sufficient for kind membership is an epistemological problem rather than an ontological one. The essentialist account of kinds does not easily account for kinds which appear to be able to change their natures.

Richard Boyd has suggested a characterisation of kinds which might be able to account for such changes in terms of the properties which exist (Boyd 1991, 1999; Millikan 1999). He argues that an entity is a natural kind in virtue of its being a cluster of properties which are commonly instantiated in the same individual, where such clusters are formed and maintained by a homeostatic mechanism. Such mechanisms are either intrinsic to the property cluster because some collections of properties are internally more stable than others, or they are extrinsic and the property cluster is maintained in a fairly stable state by the environment or some other causal mechanism. No property of the cluster need be necessary to the kind, nor need there be any property which is sufficient for kind membership, which allows for the existence of kinds which lack essences. Kinds can change because their individual members lose or gain a property, or because the extension of the kind changes such that novel individuals are included within it. Nevertheless, Boyd argues, the clustering occurs because such changes from a stable cluster have a lower chance of persisting. Thus, we can explain why the members of a species maintain the properties which they do while their environment remains stable and why they evolve as the environment changes when mutations may have a greater chance of survival.

7. Different Types of Properties

There are several useful distinctions between different types of properties. Often these are made to mark a metaphysical distinction between them, to draw attention to the fact that these different types of properties behave in significantly different ways in the same circumstances, or in order to treat them theoretically in different ways. The distinction between categorical and dispositional properties is one such distinction, which has been discussed at length above. Others are considered much more briefly in this section. In addition, the table at the end of this section includes definitions and examples of other types of properties.

a. Intrinsic and Extrinsic Properties

There is a kiwi fruit in my fruit bowl which has a huge variety of properties. It is (roughly) ellipsoid, brown, slightly hairy, bright green and white inside, it has black seeds, it is sweet, soft, contains about 10g sugar and 1g protein, weighs 63 grams and is 5cm in diameter. It is lying next to an over-ripe pear, was grown in New Zealand, is partially obscured by the electricity bill, has travelled farther than I have in the last year, is not Hillary Clinton, it has no beliefs about classical logic, and is being used in a philosophical example.

Intuitively, the properties listed in the former sentence are more important than those in the latter: the difference between the kiwi fruit and the pear is not marked by the fact that one was grown in New Zealand and the other was not (although that happens to be true), and because neither of them is Hillary Clinton and both are partially obscured by the electricity bill, those properties cannot be what mark the difference either. It would make no real difference to the kiwi fruit or its continued existence if the bill were moved from on top of it, but it will change if I get a knife and slice it in half. Not only do the properties in the former set seem to be what determine the real difference between the kiwi fruit and other things in the world, those properties are more likely to be causally efficacious: the kiwi fruit is nutritious because of them, will roll when put on a slope, and can be used to knock over small objects if your aim is good.

It would be philosophically useful to draw a distinction between the properties which (roughly speaking) a particular has in virtue of itself, its own nature, and those which it has due to its relations with other things: that is, those which are intrinsic properties and the extrinsic ones. But can we draw a principled distinction between them? Several bases for such a distinction have been suggested: some attempt to be purely logical and to avoid any commitment to a particular metaphysical position, whereas others can be classed as metaphysical criteria because their plausibility requires that one make certain assumptions about the way the world is.

It is worth noting that some properties can be intrinsic when instantiated by some individuals and extrinsic when instantiated by others. These properties are locally intrinsic or extrinsic. For instance, consider the properties being such that a dog exists or becoming nervous when encountering a dog. In either case, these properties will be extrinsic when instantiated by anything which is not a dog, but intrinsic when instantiated by a dog, thus they are locally intrinsic properties. In what follows, the use of ‘intrinsic’ is confined to properties which are intrinsic when instantiated by any individual.

Lewis suggests that his ontologically elite perfectly natural properties are good candidates to determine intrinsicality. These properties, as we saw above (3b), are the most fundamental ones and ground the existence of other properties which are natural as a matter of degree. Perfectly natural properties determine the objective similarity and difference in the world, and thereby determine whether particulars are duplicates of each other or not. Intrinsic properties are just those properties which duplicates must share. Particulars can be duplicates of each other and differ in extrinsic properties.

However, accepting this criterion depends upon accepting Lewis’s claim that there is a set of such fundamental properties and, secondly, that those properties are intrinsic ones. Neither of these claims is without its detractors. The first claim is vulnerable to criticism from both maximalists about properties and those who deny the existence of a fundamental level to reality. Lewis’s second claim that all fundamental properties are intrinsic has been challenged on the grounds that some seemingly fundamental physical properties such as gravitational mass or spin might require the existence of other particulars to be instantiated. (See Bauer 2011; Allen 2018.) Moreover, even if one accepts Lewis’s minimalist metaphysical account of what the world contains (or something fairly close to it, such as Armstrong’s genuine universals), one might worry that ‘intrinsicality’ has been very closely inter-defined with ‘duplicate’ in this case: duplicates share all their intrinsic properties, while intrinsic properties are those shared between duplicates. Even if this criterion is correct, it does not go a long way towards explaining what an intrinsic property is.

Jaegwon Kim (1982) suggests that we can characterize the distinction in terms of loneliness: intrinsic properties are the properties a particular would have even if nothing else existed in the world. (This criterion requires only that no other contingently existing objects exist and does not exclude necessarily existing particulars, if there are any, such as numbers.) However, although an object’s being lonely is intuitively an extrinsic property, since being lonely depends for its instantiation on the absence of contingently existing objects, it turns out to be an intrinsic property on Kim’s criterion (Lewis 1983b, 198–9). Langton and Lewis (1998) suggest amending Kim’s criterion: an intrinsic property is one whose instantiation is independent of loneliness and accompaniment; that is, it is a property which can be possessed or lacked by a particular regardless of whether or not any distinct, contingently existing objects exist. However, this criterion is still not adequate, since some properties such as being spherical and lonely or non-spherical and accompanied turn out to be independent of loneliness and accompaniment, and thereby would count as being intrinsic. Langton and Lewis rule these disjunctive properties out by fiat, by characterising disjunctive properties as those which have disjuncts which are more natural than they are. (Recall Lewis’s account of naturalness in 3b above.) Accordingly, an intrinsic property is one which is independent of loneliness and accompaniment, and also is neither a disjunctive property nor the negation of a disjunctive property. As with Lewis’s original criterion based on duplication (which he does not reject in favour of the new criterion), Langton and Lewis’s criterion is a metaphysical one because it requires commitment to some kind of property hierarchy.
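
The independence condition in Langton and Lewis’s proposal can be set out schematically. What follows is only a sketch under stated assumptions: the predicates L (‘is lonely’) and A (‘is accompanied’, that is, coexists with some distinct contingently existing object) and the possibility operator are notation introduced here for illustration, not taken from the works cited.

```latex
% Assumed notation (illustrative only):
%   L(x): x is lonely;  A(x): x is accompanied;  \Diamond: it is possible that
% Requires amsmath and amssymb.
\begin{align*}
P \text{ is independent of loneliness and accompaniment} \iff {}
  & \Diamond\,\exists x\,\bigl(L(x) \land P(x)\bigr) \\
  \land\ & \Diamond\,\exists x\,\bigl(L(x) \land \lnot P(x)\bigr) \\
  \land\ & \Diamond\,\exists x\,\bigl(A(x) \land P(x)\bigr) \\
  \land\ & \Diamond\,\exists x\,\bigl(A(x) \land \lnot P(x)\bigr)
\end{align*}
```

On this reading, being lonely fails the test, since nothing can be lonely and lack it and nothing accompanied can have it. The property being spherical and lonely or non-spherical and accompanied satisfies all four conjuncts: a lonely sphere has it, a lonely cube lacks it, an accompanied cube has it, and an accompanied sphere lacks it. This is why the further restriction on disjunctive properties is needed.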

One might also be concerned about the scope of Langton and Lewis’s criterion since they specifically state that their criterion omits properties which involve particular entities, which they call impure properties, such as being Nelson Mandela or being more than fifty kilometres from Juba. In addition, the criterion makes all indiscriminately necessary properties, such as being such that 2 + 3 = 5, intrinsic as long as they are not disjunctive. (Lewis’s original duplication account, on the other hand, treats all indiscriminately necessary properties as intrinsic.) If this is the case, each particular has infinitely many more intrinsic properties than we would usually be inclined to attribute to it. One could exclude indiscriminately necessary properties from the criterion as well as impure properties, but the consequence of that would be an even less general criterion than before. In response, some philosophers have called for a more general criterion to distinguish between intrinsic and extrinsic properties which is able to take all properties into account.

One attempt to distinguish intrinsic and extrinsic properties on purely logical grounds is by defining extrinsicality. The instantiation of an extrinsic property by an individual consists in its bearing certain relations to at least one distinct individual, while properties which do not do this are intrinsic. We can call the former d-relational properties and maintain that properties which are not d-relational are intrinsic (Francescotti 1999, Harris 2010, 467). There are drawbacks to this account as well, however. First, it is not obvious that one can determine what counts as a ‘distinct individual’ without recourse to intrinsic and extrinsic properties, or else by introducing a metaphysical element into the criterion. If one individual’s being distinct from another requires their not having intrinsic properties in common, then we have made no progress. Second, one might be concerned about how we should deal with d-relations to abstract objects. If an individual can be d-related to abstract objects, then some properties turn out to be extrinsic which seem intuitively to be intrinsic: for instance, the sugar’s weighing 1 kilogram is extrinsic if 1 is an abstract object; in fact, all measurement properties would turn out to be extrinsic properties. On the other hand, if we accept that an individual’s relations to abstract objects cannot make the properties it instantiates d-relational, then indiscriminately necessary properties such as being such that 37 exists all turn out to be intrinsic, and this is another outcome we might hope to avoid.

As these and other suggested criteria have all turned out to be unsatisfactory, some philosophers have suggested that our intuitions about intrinsic and extrinsic properties are unstable and involve more than one division between properties. In this vein, Marshall (2016) suggests that intrinsicality covers three related types of properties: interior properties associated with an individual’s internal nature; properties preserved in duplication; and local properties which are necessarily ascribed to an individual on the basis of how it and its parts are. These, it is argued, play different roles in metaphysical explanation.

b. Accidental and Essential Properties

An individual can survive the loss of some properties and still retain its identity, while other properties are essential to it; were it to lose one of these latter properties it would no longer be the type of particular that it is. We can call the former properties accidental properties and the latter essential ones. For example, a dog is usually larger than a rabbit, has four legs, is domesticated and can swim; it also has a DNA profile similar to that of other dogs and has parents who are also dogs. A particular dog could lose a limb or be unable to swim, and it would still count as being a dog. But were an animal not to have dogs for parents, we would be unlikely to consider it to be a dog. (This example is employed for simplicity, but as noted above in Section 6, species are not really good examples of this distinction, since it is not obvious that there are properties which are essential to being a certain species.) Similarly, it is essential to a piece of gold that it has atomic number 79, but accidental that it is liquid or that it weighs two grams. The former essential property is shared by everything which counts as gold, whereas the latter properties are instantiated by the particular qua gold as a matter of contingent fact.

What is being given here is a modal characterisation of the distinction between accidental and essential properties: the former are those which a particular could lack while still being of the broader type that it is, while if something lacked its essential properties it would cease to exist (at least as the type of thing which it is). To put the point another way, a particular cannot lack its essential properties. Essentialism is the view that at least some particulars have essential properties.
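
The modal characterisation can be put schematically as follows. This is a minimal sketch: the existence predicate E and the operators for necessity and possibility are notational assumptions introduced here, and the kind-relative reading (ceasing to exist as the type of thing it is) would need a further kind predicate in place of E.

```latex
% Assumed notation (illustrative only):
%   E(x): x exists;  \Box: necessarily;  \Diamond: possibly
% Requires amsmath and amssymb.
\begin{align*}
P \text{ is essential to } x  &\iff \Box\,\bigl(E(x) \rightarrow P(x)\bigr) \\
P \text{ is accidental to } x &\iff P(x) \land \Diamond\,\bigl(E(x) \land \lnot P(x)\bigr)
\end{align*}
```

On this schema, any property that a particular has in every possible circumstance in which it exists counts as essential to it, whatever that property happens to be.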

At first glance, the modal characterisation of the distinction between accidental and essential properties fits well with our common-sense intuitions; the properties without which an individual could not exist seem intuitively to capture the essence of that individual. But this characterisation has been challenged because on closer inspection it turns out to classify a range of properties as essential which do not contribute to making a particular the kind of thing that it is. For instance, in this characterisation of the distinction, essential properties will turn out to include all of what we call indiscriminately necessary properties. These are properties which everything has, such as being such that 37 is a prime number or being such that the ratio of the circumference to the diameter of a circle is π. Since these properties are instantiated by everything, they do not intuitively contribute to making each individual what it is; they are not intuitively part of its essence. Furthermore, as Kit Fine (1994) pointed out, each individual has more specific properties necessarily which do not appear to determine that individual’s essential nature. For example, Socrates has the property of being the sole element of the singleton set containing Socrates (that is, being the sole member of {Socrates}), but that property is not, one would think, an essential property of Socrates the man. Fine argues that these examples are enough for us to abandon the modal characterisation of the distinction for an alternative.

In view of this problem, amended accounts have been sought, including Fine’s own suggestion which is that essential properties contribute to the definition of an object, or amended modal criteria which attempt to rule out the problematic properties on the grounds that they are not intrinsic to the individuals in question (Denby 2014), are not locally necessary to the individuals (Correia 2007), or are not sparse properties (Wildman 2013, Cowling 2013). (See also Zalta 2006 for an alternative approach.) As with the attempts to distinguish intrinsic from extrinsic properties, there is a danger of close inter-definition here, and consequently one of circularity: it may not be possible to characterise the intrinsic-extrinsic distinction (say) without a grasp upon the essential-accidental distinction or the distinction between sparse and abundant properties, and vice versa, making the resulting explanations quite impoverished. From an ontological point of view, however, such inter-definition is acceptable, and one might feel justified in following Lewis and simply assuming that the characteristics of intrinsicality and sparseness go together, alongside being an essential property when such properties are present.

c. Monadic and Polyadic Properties

Thus far, this article has been primarily concerned with properties which, on each instantiation, are instantiated by one individual: properties such as being blue, being a cube, being an electron, or being a dog. These are monadic properties. However, many properties appear to require more than one individual to be instantiated: Edgar is friends with Julia, the cat is inside the box, Amir is in between Julia and Edgar, Julia is in the same class as Amir and Marie, and 2 is a common factor of 8, 10 and 12. These properties are more commonly known as relations, since they determine how one thing (or more) stands to others. But because they usually require more than one individual to be instantiated (or else, they relate one individual to itself), they are also known as polyadic properties, with their adicity capturing how many individuals are required to instantiate the property: Edgar is friends with Julia is the instantiation of a dyadic property, while being in between is a triadic property instantiated by Amir, Julia and Edgar, and so on.

The predicates of our natural languages allow for many cases in which the number of argument places of a predicate (its degree) is variable: ‘is friends with’ is two-place in the example above, but as ‘are friends with each other’ it could be three-place, four-place, five-place or more; similarly, ‘being in the same class as’ or ‘being a common factor of’ can vary in degree. In most formal logic, the degree of a predicate is fixed (for an exception, see Orilia 2000), but if we use natural, rather than formal, language as a guide to ontology, we might be tempted to think that the properties which correspond to these predicates can vary in their adicity. These are variably polyadic or multigrade properties which admit of a different number of participants in different circumstances.

We can distinguish internal relations from external ones (although philosophers disagree about what exactly they mean by ‘internal relation’). Briefly put, an internal relation is a relation which exists if its relata do. For instance, Ben Nevis is taller than Snowdon, but nothing more is needed for the is taller than relation holding between them than the existence of the two mountains at the heights which they actually are. On the other hand, being friends with each other is an external relation: the mere existence of Edgar and Julia is not sufficient to ensure that they are friends as they might never meet or may not get on; the relation of their being friends with each other exists in addition to the existence of its relata.

Internal relations (and hence the distinction between internal and external relations) are characterised in slightly different ways. For instance, Armstrong maintains that a relation is internal if its existence is necessitated by the intrinsic natures of its relata (1997, 87–9). For instance, in the case of Ben Nevis and Snowdon, their intrinsic properties of being the height that they are necessitate the existence of the relation of Ben Nevis being taller than Snowdon. On the other hand, Lewis claims that an internal relation is one which supervenes upon the internal nature of its relata. An earlier version of the distinction, proposed by G. E. Moore, is that a relation R between entities b and c is internal if the existence of b necessitates that b bears the relation R to c (1919, 47). Thus, in Moore’s case, only the existence of b is necessary for the relation between b and c to hold. Moore’s kind of internal relation has sometimes been distinguished as ‘super-internal’ where the existence of R is necessitated only in virtue of b’s intrinsic properties, or as simply a ‘one-sided’ relation when extrinsic features of b might also be relevant to necessitate the existence of relation R between b and c (see Bennett 2017, 192–4). Because internal relations exist if their relata do, their addition to the ontology (and employment in metaphysical theories) requires no additional ontological commitment over and above the entities they relate (and a general commitment to the existence of such relations). Thus, they have been described by Armstrong as ‘an ontological free lunch’ (1989, 56).

From a historical perspective, relations were not considered to be real entities, with the underlying motivation for this being the conviction that they could be reduced to or supervene upon monadic properties. However, such a reduction has never been fully explained. Furthermore, relations are regarded as being philosophically problematic for at least two reasons. The first is that even when external relations are instantiated, it is not clear where they are: Bangalore is south of New Delhi, but the relation being south of is not one of the properties which these two cities instantiate individually, so it is not located entirely where either of the cities is, and so one might wonder where the relation is. Perhaps its location is somehow divided between its relata, but it must be divided in such a way that the relation can be considered as one unified entity. Furthermore, Heil complains that relations do not fit neatly into our ontological categories of substance or attributes, that they are ‘neither fish nor fowl’ (2012, 141). But neither of these complaints counts decisively against the existence of irreducible relations: if they exist, they simply have to exist (and to have their location) in a way different than either substances or monadic attributes. Like Armstrong’s immanent universals which are wholly present in each of their instantiations, relations are not bound to behave in the same way as the objects and properties of ordinary middle-sized objects.

Another objection threatens the existence of external relations, a version of which was discussed in 4a. This is known as Bradley’s Regress (1893, 32–3). If relation R genuinely relates objects b and c, then R must be something to b and c. However, if R is something to b and c, then there must be a relation R’ which captures the relation between R and b and c. However, if R’ genuinely relates R, b and c, then there must be another relation R’’ which relates R’ to R, b and c; which in turn requires the existence of another relation R’’’, and so on. There is a regress of relations and thus, argues Bradley, the existence of external relations is impossible.

There have been some attempts to solve Bradley’s Regress using relational tropes (Maurin 2010, 321–3) or facts (Armstrong 1989, 109–10); but, as MacBride has argued, these strategies rely upon assuming the coherence of relations in the first place (2011). Russell, on the other hand, adopted an alternative strategy which highlighted the indispensability of relations, such as spatio-temporal relations, to science (1924, 339). It is more likely, he argues, that there is something wrong with Bradley’s regress argument than that we are wrong to take so much of our fundamental science at face value.

A challenge for any philosophical account of relations, assuming now that they can be construed realistically, is how we should understand how non-symmetric relations make a contribution to different states of affairs. The same constituents—Edgar, Julia and the relation of seeing (for instance)—can form two distinct states of affairs: Edgar sees Julia and Julia sees Edgar, which differ in relational order or differential application. Russell (1903, 218) became interested in giving an account of this relational order, a question which has been taken up in contemporary metaphysics (Hochberg 1987; Fine 2000; Orilia 2011). One might think of the difference between the two states of affairs as being explained by the relation having a direction, of the relation being directed from one relatum to another; or one might think that the positions or argument places of the relation are occupied in different ways. In this case, the argument place occupied by the one being seen is different from the one doing the seeing. Fine criticises these two accounts and suggests his own, non-local account of how we can explain differential application in terms of the other states of affairs into which a particular relation enters. Alternatively, MacBride has suggested that we should accept relational order as primitive, in the same way that most philosophers who accept real external relations avoid Bradley’s Regress by simply assuming that the fact that b relates c does not require further explanation (2014).

d. Determinable and Determinate Properties

Being vermilion or being crimson are specific cases of being red, which is itself a specific case of being coloured. Similarly, being triangular is a case of being shaped, and having a mass of 1.06 kilograms is a specific instance of having mass. This relationship between properties such as being coloured and being red, and then between being red and being crimson, is known as the determinable-determinate relation, where colour is the determinable and crimson is the determinate instance of it. Given that a property, such as being red, can be both determinable and determinate, a property’s status as determinable or determinate is usually regarded as a relative matter. The different determinates of a particular determinable often exclude one another (if something is red, it cannot be blue or green), and this was thought to be a defining feature of a determinable and its determinates, although this is not always the case, since one can argue that different determinate odours or tastes are compatible with each other (Armstrong 1978b, 113). Nevertheless, even in cases where determinates do exclude each other, the determinable does not appear to be simply the conjunction of all the determinates but something over and above that.

One philosophical question which arises as a result of this distinction is what the relationship between determinables and determinates is. One can be a realist about both determinates and determinables, at which point the further question arises about whether determinates are more ontologically fundamental than determinables; one can be a reductionist about determinables; or one can be an anti-realist about determinables.

One might wonder whether there are any ontologically irreducible determinable properties on epistemic grounds: perhaps we only have to refer to determinable entities such as colour and shape because of our perceptual or cognitive limitations. It is too complicated to think about the world in maximally specific terms, or we do not have the perceptual apparatus to be able to detect such maximal specificity; however, in the absence of these limitations, we would not require determinables. For example, see Heil (2003). However, for this argument to be plausible, and for the reduction or elimination of determinables to be possible, the world must be absolutely determinate and without metaphysical vagueness, and this too is a matter of philosophical debate. Nevertheless, the ontological conviction that the world is maximally determinate is an important motivation for reductive or anti-realist views.

On the other hand, the reality of irreducible determinables is problematic since it is not obvious that we can perceive determinables as such: we perceive shape in virtue of perceiving specific shapes, or colours in virtue of perceiving determinate colours. We do not seem to be aware of determinables as objects of our perceptions.

However, Prior (1949) suggests that determinables must be more than their determinates because determinates are similar with respect to those determinables: red, blue and orange are similar with respect to their colour as are being triangular and being oval with respect to their shape. For this respect to exist, one might argue, determinables must be ontologically independent of determinates and must be real. Furthermore, this ontological point is exploited by Fales to improve the epistemological situation with respect to determinables. He notes that we can perceive the specific similarity between determinates, and in doing so we must be indirectly aware of determinables (1990, 172).

A second argument for the existence of determinables comes from their role in laws of nature and the fact that they are postulated in scientific explanations. For instance, we think of Newton’s second law as holding between the determinables mass, force and acceleration, rather than there being infinitely many laws holding between determinate instances of these determinables. Furthermore, in chemical laws, the relevant relationship holds between determinables (between acids and alkalis, to give a simple example), and one might argue that the specific molecular features of the determinate substances are not important (Batterman 1998).

Realists about determinables have presented a variety of accounts, including an essentialist account (Yablo 1992) which treats determinables as having essences which are contained within the essences of their determinates; accounts based on the causal relations of the determinables being a subset of those of the determinates (Fales 1990); and a causal powers-based account in which causal powers of a determinable are a subset of those of any and all of its determinates (Wilson 1999).

The main version of reductionism about determinables treats them as disjunctions of all their determinates: being coloured is equivalent to being red or being blue or being green or . . . . One objection which is raised against this view is that it does not match the way we think about determinables. Moreover, it seems that someone might fully understand a determinable such as colour while having no conception of all the disjuncts of the disjunction (all the different colours) which make up that determinable. In such cases it is not obvious how the reductionist can maintain that such a person understands the determinable in question. Furthermore, the assumption that the world is maximally determinate is questioned on the basis that it is thought to violate the principle of plenitude with respect to the possible ways the world might be. See also Bigelow and Pargetter (1990) for an alternative version of reductionism.

e. Qualitative and Non-Qualitative Properties

Prima facie, it appears that properties such as being blue, having a mass of 1 kilogram, or being an electron are different in kind to being Barack Obama, being such that 4 is an even number, and being the same weight as William Shakespeare. The first set of properties apply to the individuals which instantiate them in virtue of the qualities those individuals have (and also, if they are extrinsic properties, in virtue of the qualities which other individuals have and the relations between them), while the latter do not. The latter class of properties includes haecceistic properties, impure properties and identity properties (and disjunctions and negations of these), and arguably also modal and temporal properties (being possible, being actual, being now) and mathematical properties. (See 7f for some examples of these and further definitions.)

Can we draw a distinction between qualitative and non-qualitative properties, and is there a criterion according to which we can do so? A principled distinction would be philosophically useful, since the distinction is already employed in its intuitive formulation: it is qualitative properties, not non-qualitative ones, which are shared by duplicates. Langton and Lewis’s distinction between intrinsic and extrinsic properties also applies only to qualitative properties (1998, and see 7a); laws of nature are taken to connect qualitative properties rather than non-qualitative ones, and furthermore, inductive inferences are considered illegitimate if the terms within them refer to non-qualitative properties (Hempel and Oppenheim 1948). In addition, claims about the truth of physicalism are usually restricted to claims about the ultimately physical nature of qualitative properties.

There has been some contemporary philosophical consideration of this distinction (Diekemper 2009; Cowling 2015). Reductive analyses of non-qualitative properties have attempted to account for them in terms of the linguistic attributes of the predicates which apply to them (that they always include proper names, for example), or have attempted to characterise non-qualitative properties as being those whose existence necessarily requires the existence of specific individuals (Rosenkrantz 1979). While this latter account is plausible for many positive non-qualitative properties—for instance, being Barack Obama requires the existence of Barack Obama—it does not work as well for negative non-qualitative properties such as being distinct from Barack Obama, since such a property might exist in the absence of Barack Obama himself. Alternatively, one might suggest that qualitative properties are specifically those which can be defined in an appropriate way from perfectly natural properties, or are those which supervene on them (Bricker 1996). Cowling (2015) finds all these alternatives problematic and advocates a primitivist approach to the distinction.

f. Technical Terms for Property Types

Since there are several specialised technical terms for different types of properties, it will be useful to list them here.

| Property Name | Description | Examples |
|---|---|---|
| Qualitative Properties, Pure Properties | General Qualities | being green, having mass, being an armadillo, being near an iceberg |
| Existential Properties | Property that requires the existence of something or other (usually of a certain type) | being such that a cat exists, being such that a triangle exists |
| Haecceistic Properties, Identity Properties, Impure Properties | Property which involves a particular entity | being Marie Curie, being 300km from Bamako, not being written by David Lewis |
| Identity Properties | A subset of haecceistic properties involving being a particular thing | being Obama, being Marie Curie |
| Indiscriminately Necessary Properties | Property instantiated by every particular | being such that 6 + 6 = 12, being self-identical, being such that a triangle exists |

8. Realism about Properties: Do Properties Exist?

So far, this article has presupposed that properties exist mind-independently, or that at least some of them do. But this claim has been challenged for two main reasons. First, there are the concerns about there being constitutive identity and individuation criteria for properties which were raised in Section 2. Second, there are several interconnected epistemic worries about whether and how we are able to discover or to refer to the properties which exist mind-independently (Putnam 1981; Elgin 1995; Allen 2002). While these do not challenge the existence of properties directly, they remove some of the motivation for postulating that the world has objective qualitative joints of the kind which properties mark, since this motivation has traditionally been based upon the explanatory power which an ontology containing properties has. If we are not justified in our beliefs about which properties exist, it is hard to see how they can have any explanatory power.

Since such epistemic worries do not directly challenge the existence of properties unless one has a fairly strict requirement that the entities of our ontology be epistemically accessible to us, it remains open to the property theorist to advocate a kind of ‘Kantian humility’ about whether the properties which we think exist are the ones which there really are (Lewis 2009). If this attitude is acceptable, then properties can be employed in metaphysics whatever their epistemic relationship to us.

9. Properties in the History of Philosophy

Concern about how we should understand qualitative similarity was a prominent issue during several periods of philosophical history. Since the historical discussions of properties are varied and detailed, as well as sometimes being enmeshed with specific philosophical concerns of the time, it will be impossible to do justice to them here. Bearing this problem in mind, this article is restricted to considering the very first known theories of properties and then summarising other notable points at which discussion about properties became prominent.

a. Ancient Theories of Properties

In the philosophical traditions of both ancient Greece and ancient India, the phenomenon of similarity and difference between distinct things prompted a certain amount of consternation which became bound up with the desire to explain the even more troubling phenomena of persistence and change. Early philosophers could see—on the basis of their everyday experience—that there were different things around them which were nevertheless the same: entities could be equal and yet unequal, a phenomenon which was in danger of being contradictory. Some philosophers postulated the existence of different elements or substances to account for these similarities and differences, which led to pre-Socratic accounts of the world in which one element is more important or more fundamental than the others; there is an archê or material principle in virtue of which the other substance types come into existence. For Thales, the archê is water; for Heraclitus (in some interpretations) fire; while others preferred pluralistic accounts of the elements, such as Empedocles’ four: earth, air, fire and water.

However, these accounts of different elemental substances stop short of being property theories because they do not have a conception of entities which can be co-located with each other (that is, which can be instantiated in the same spatio-temporal region as each other) and which also perhaps inhere in a more fundamental substance. Thus, in such theories, it is particularly difficult to explain the phenomenon of change. If one has only substances and no properties, the causation of one thing B by another A appears to be a case of substance A being destroyed and substance B being created: if one melts sand and salt together and gets glass, it appears that the sand and salt have been destroyed and the glass created. Each case of change or causation is a radical transformation, conceptually equivalent to the creation of one substance simultaneously with the destruction of another. Furthermore, it appears that the glass has been created from something which is not glass; it was not clear how to explain the coming-into-existence of such things from what they are not, or even how change is possible at all. The explanatory situation is arguably even more serious since it does not just affect cases of substantial change, such as salt and sand turning into glass, but also seemingly insignificant changes such as a hot cup of coffee getting cooler or a solid ice cube becoming liquid as it warms. (See Parmenides’ On Nature, specifically The Way of Truth, which denies the existence of both change and differences of type.) Such problems with change gave rise to fruitful metaphysical discussions, only fragments of which survive today, and generated what became the first theories of properties. How good an account of properties and change any of the pre-Socratics managed to give is therefore a matter of controversy, although Marmodoro (2015) argues that Anaxagoras treated kinds of substances as powers, and several commentators have ascribed a sophisticated account to Heraclitus (Finkelberg 2017).

Perhaps the most famous account of properties from Ancient Greece can be attributed to Plato, who formulated the theory of forms, the first known version of a theory of universals. Plato presented what became known as ‘the One Over Many argument’ in which he argued that many particular F-things could also be one if they are regarded as instantiating or participating in a universal F-ness (Republic, 596a). This accounts for how distinct particulars can be qualitatively the same by grounding their qualitative similarity in the universal which they all instantiate, and thus avoids the contradictory claim that such particulars are both the same and different, or that they are equal and unequal at the same time. For instance, different cats are the same because they instantiate the universal cat and are different because they are distinct individuals. Further differences can be grounded by universals which some of the cats instantiate and others do not, such as being tabby, being fat, or being feral. In addition, Plato argued that the forms must transcend the instances of them: first, because exact (qualitative) equality between different particulars cannot be experienced in nature and thus cannot be due to relations between the particular objects themselves; and second because there are some forms of which no perfect instances exist, such as the perfect circle, although examples of imperfect circles abound.

Following Plato, Aristotle accepted that objective similarity and difference is grounded by forms or universals, but he denied that such entities are transcendent. In his view, universals are immanent, wholly present in each of their instances, rather than being abstract entities which exist independently of them. Furthermore, Aristotle made a distinction between properties or attributes and the substance in which they inhere, or the particular which instantiates them. In this view, some of the philosophical mystery concerning change is dissipated since an entity can persist while the properties which it instantiates change. Water instantiates solidity and cold when it is frozen and liquidity and (comparative) warmth as it heats up, but the water continues to exist. Such an ontology maps conveniently onto the different grammatical elements of our ordinary language (at least if we speak a language with subjects and predicates and adjectives and nouns) with the substances being picked out as the subject or the object, and adjectives or predicates referring to the properties. Substance types such as cat, human, or water are further determined by particulars instantiating immanent universals, and we can understand substantial change—the creation of water, for instance, in a chemical reaction—by a change in the properties instantiated by matter.

Another contrast between Aristotle’s view and the earlier one of Plato is in the nature of the properties or universals they postulated: for Plato, universals can enter into causal relations (despite being abstract objects) but they are predominantly required to determine which category or type of thing a particular is; whereas, for Aristotle, universals have essential causal powers to bring about certain effects in the appropriate circumstances. For Aristotle, a particular’s instantiating a universal gives it the potentiality to have an effect, an effect which will be actualised if the particular is in the appropriate conditions. An ice cube has the potentiality to melt in appropriately warm conditions even if the particular ice cube is never in an environment warmer than zero degrees Celsius. Aristotelian properties are essentially causal, which makes Aristotle’s view similar to that of the dispositionalists discussed in Section 5.

Early Indian philosophers encountered similar obstacles to the Greeks in attempting to understand the phenomena of persistence and change, which some early metaphysicians sought to alleviate by distinguishing quality from substance. For instance, Kaṇāda, founder of the Vaiśeṣika school, distinguishes three categories of existents: substance, quality and action, which together can provide an account of the constitution of the cosmos and the change within it (Kaṇāda, Vaiśeṣika Sūtra 8.14). Vaiśeṣika metaphysics, in conjunction with the broadly metaphysical realist Nyāya epistemological system founded by Akṣapāda Gautama, provides a sophisticated account of real and existent particulars and real universals according to which particular substances, qualities and actions fall into categories. The Vaiśeṣikas consider what is existent to be a subset of the real: universals are real but not existent because they are objective, mind-independent entities rather than unreal or imaginary ones, but they do not exist in the same sense as individual objects or qualities. Particular qualities are thus more fundamental than universals are for the Vaiśeṣika (the former exist and are real, whereas the latter are merely real), making Vaiśeṣika perhaps the earliest form of trope theory (Matilal 1990, ch. 4; Halbfass 1992, 122–7).

Universals are apprehended directly via perception and are eternal, unitary and located in a plurality of things; that is, like Aristotle’s account of them, they are immanent in that a universal is wholly present in every particular which instantiates it. Particular cows, or particular colours, or particular academic institutions, fall into the categories which they do because of the universals which they instantiate. Moreover, such universals can be further distinguished according to whether they determine natural or conventional classifications: cows and colours would be categorised as natural universals (jāti) while being an academic institution is an imposed classification (upādhi), determined as a matter of convention.

In common with objections to other, much later accounts of immanent universals (Armstrong 1978b), the early Buddhist philosopher Diṅnāga raised an objection to the Nyāya-Vaiśeṣika conception of a universal on the basis that a unitary entity’s being wholly present in multiple locations is incoherent. In the tenth century, Udayana attempted to provide a strict distinction between natural and imposed universals, and also placed restrictions upon the natural universals so that they could not fall foul of the problems associated with instantiation and self-instantiation noted below in Section 5 (Udayana, Kiraṇāvalī). The development of this metaphysics of properties then continued in the school of Navya-Nyāya (or New Nyāya). See, for instance, Annambhaṭṭa’s The Manual of Reason.

b. Medieval Theories of Properties

The subject of properties came to the fore once again in 12th Century Western European philosophy, and questions about what grounds qualitative similarity became important. Peter Abelard and Guillaume de Champeaux debated the nature of universals, with the former developing a form of nominalism, the view that universals are not objectively existing entities but are names, or irrealism which did not seek to determine the ontological status of universals at all. Abelard argued that realism about universals inherited from Boethius is incoherent since the instantiation of a universal by otherwise very different particulars would lead to contradictions. Both a frog and Aristotle instantiate the universal animal, but that makes it both irrational and rational, which is a contradiction.

The rediscovery of the works of Aristotle in Western Europe from the middle of the 12th Century onwards also encouraged the ongoing debate. William of Ockham formulated a version of nominalism which is sometimes regarded as an early trope theory, and Aquinas adopted aspects of Aristotle’s theory of universals and incorporated into them Aristotle’s notion of causal powers in order to explain qualitative similarity, the nature of change and natural necessity.

c. Properties and Enlightenment Science

The European Enlightenment changed the focus of discussions about properties away from ontological worries about what properties are towards concerns about how properties fit in with our scientific worldview. One result of this change of focus was the development of a distinction between properties which has become known as ‘the primary and secondary quality distinction’. Most famously espoused in the work of John Locke, the distinction was inherited by Locke from Galileo, Malebranche and Boyle, and was widely held in some form by scientists of the time who began to distinguish those properties which are perceived exactly as they exist in objects from those which are mediated by the senses (or in some versions of the distinction are entirely subjective). A tomato has its near-spherical shape objectively, but it does not have its red colour independently of being perceived by a conscious observer. Primary qualities, according to Locke, include Shape, Size, Motion, Number, Texture, and Solidity, while secondary qualities are Colour, Taste, Sound, Felt Texture and Smell. If there were no perceivers, the latter qualities would not exist, but that is not usually taken to imply that these qualities are entirely subjective and do not in any sense exist in the objects which appear to instantiate them. Rather, as Locke maintains, there is a causal relationship between the objects and our sensory system such that secondary qualities are caused by the primary qualities of objects with the effects being mediated by the senses; secondary qualities ‘are powers to produce various sensations in us’ (Locke, 1689, II.viii, §10).

A second feature of early modern property theories involved growing empiricist distrust of the Aristotelian conception of properties as being causal powers, entities which make effects occur (in the appropriate circumstances) and thereby ground natural necessity. Most famously, David Hume found nothing in sensory experience—no corresponding sensory impression—which indicated the existence of necessary connexions in nature of the variety which causal powers might ground. For the strict empiricist, there is no reason to believe in the existence of unactualized possibilities or potentialities—potentialities which have not manifested their effects—when all which can be observed are the actual effects when they occur. I can never experience the potential of a sugar cube to dissolve in water; I can only observe its dissolving when it actually does so. For the strict empiricist, powers or potentialities are mysterious features of objects, beyond our possible experience, and so we should not postulate their existence.

10. References and Further Reading

  • Allen, S. R. 2002. Deepening the controversy over metaphysical realism. Philosophy 77: 519–541.
  • Allen, S. R. 2016. A Critical Introduction to Properties. London: Bloomsbury.
  • Annambhaṭṭa. Edited and translated by G. Bhattacharya. 1983. The Manual of Reason. Calcutta: Progressive Publishers.
  • Armstrong, D. M. 1978a. Universals and Scientific Realism. Volume 1. Cambridge: Cambridge University Press.
  • Armstrong, D. M. 1978b. Universals and Scientific Realism. Volume 2. Cambridge: Cambridge University Press.
  • Armstrong, D. M. 1980. Against ‘Ostrich Nominalism’. Pacific Philosophical Quarterly 61: 440–9.
  • Armstrong, D. M. 1983. What is a law of nature? Cambridge: Cambridge University Press.
  • Armstrong, D. M. 1989. Universals: An Opinionated Introduction. Boulder, CO: Westview Press. (pp 75–112 reprinted as ‘Universals as attributes’ in Loux (ed.), 2001: 65–91.)
  • Armstrong, D. M. 1992. Properties. In Mulligan (ed.), 1992: 14–27.
  • Armstrong, D. M. 1997. A World of States of Affairs. Cambridge: Cambridge University Press.
  • Armstrong, D. M. 1999. The causal theory of properties: properties according to Shoemaker, Ellis, and others. Philosophical Topics 26: 25–37.
  • Armstrong, D. M. 2004. Four Disputes about Properties. Synthese 144: 309–20.
  • Batterman, R. 1998. Why Equilibrium Statistical Mechanics Works: Universality and the Renormalization Group. Philosophy of Science 65: 183–208.
  • Bauer, William A. 2011. An argument for the extrinsic grounding of mass. Erkenntnis 74: 81–99.
  • Bealer, George. 1982. Quality and Concept. Oxford: Oxford University Press.
  • Bennett, Karen. 2017. Making things up. Oxford: Oxford University Press.
  • Bigelow, John, and Pargetter, R. 1990. Science and Necessity. Cambridge: Cambridge University Press.
  • Bird, A. 2007. Nature’s Metaphysics. Oxford: Oxford University Press.
  • Bird, A. 2014. Human Kinds, Interactive Kinds and Realism about Kinds. Unpublished Manuscript.
  • Bird, A. 2017. Manifesting Time and Space. In Jacobs (ed.), 2017: 127–138.
  • Black, R. 2000. Against quidditism. Australasian Journal of Philosophy 78: 87–104.
  • Borghini, A. and Williams, N. E. 2008. A dispositional theory of possibility. Dialectica 62: 21–41.
  • Boyd, R. 1991. Realism, Anti-Foundationalism and the Enthusiasm for Natural Kinds. Philosophical Studies 61: 127–148.
  • Boyd, R. 1999. Homeostasis, Species, and Higher Taxa. In Wilson (ed.), 1999: 141–186.
  • Braddon-Mitchell, D. and Nola, R. (eds.). 2009. Conceptual Analysis and Philosophical Naturalism. Boston, MA: MIT Press.
  • Bradley, F. H. 1893. Appearance and Reality. London: Swan Sonnenschein.
  • Bricker, P. 1996. Isolation and unification: The realist analysis of possible worlds. Philosophical Studies 84: 225–238.
  • Broad, C. D. 1933. Examination of McTaggart’s Philosophy: Vol. 1. Cambridge: Cambridge University Press.
  • Carnap, R. 1928. The Logical Structure of the World. Berkeley: University of California Press.
  • Carnap, R. 1936–7. Testability and Meaning. Philosophy of Science 3: 419–471 and 4: 1–40.
  • Cartwright, N. 1989. Nature’s Capacities and their Measurement. Oxford: Oxford University Press.
  • Choi, S. 2008. Dispositional Properties and Counterfactual Conditionals. Mind 117: 795–841.
  • Contessa, G. 2015. Only powers can confer dispositions. Philosophical Quarterly 65: 160–76.
  • Correia, F. 2007. (Finean) Essence and (Priorean) Modality. Dialectica 61: 63–84.
  • Cowling, S. 2013. The Modal View of Essence. Canadian Journal of Philosophy 43: 248–266.
  • Cowling, S. 2015. Non-Qualitative Properties. Erkenntnis 80: 275–301.
  • Denby, D. 2014. Essence and Intrinsicality. In R. Francescotti (ed.), 2014: 87–109.
  • Devitt, Michael. 1980. ‘Ostrich Nominalism’ or ‘Mirage Realism’. Pacific Philosophical Quarterly 61: 433–9.
  • Diekemper, J. 2009. Thisness and events. Journal of Philosophy 106: 255–276.
  • Ehring, Douglas. 2011. Tropes: Properties, Objects and Mental Causation. Oxford: Oxford University Press.
  • Elgin, Catherine Z. 1995. Unnatural science. Journal of Philosophy 92: 289–302.
  • Ellis, B. 2001. Scientific Essentialism. Cambridge: Cambridge University Press.
  • Fales, Evan. 1990. Causation and Universals. London: Routledge.
  • Fine, K. 1994. Essence and Modality. Philosophical Perspectives 8: 1–16.
  • Fine, K. 2000. Neutral Relations. Philosophical Review 109: 1–33.
  • Finkelberg, A. 2017. Heraclitus and Thales’ Conceptual Scheme. Leiden: Koninklijke Brill.
  • Francescotti, Robert. 1999. How to define intrinsic properties. Noûs 33: 590–609.
  • Francescotti, Robert. (ed.) 2014. Companion to Intrinsic Properties. Berlin: De Gruyter.
  • Frege, Gottlob. 1884. Die Grundlagen der Arithmetik. Translated by J. L. Austin (1950, second edition 1968) as The Foundations of Arithmetic. Evanston, IL: Northwestern University Press.
  • Gautama, Akṣapāda. Nyāya Sūtra.
  • Goodman, N. 1954. Fact, Fiction and Forecast. Cambridge, MA: Harvard University Press.
  • Halbfass, W. 1992. On being and what there is: Classical Vaiśeṣika and the History of Indian Ontology. Albany: State University of New York Press.
  • Handfield, T. 2005. Armstrong and the Modal Inversion of Dispositions. Philosophical Quarterly 55: 452–61.
  • Harris, R. 2010. How to define extrinsic properties. Axiomathes 20: 461–478.
  • Hawthorne, J. 2001. Intrinsic properties and natural relations. Philosophy and Phenomenological Research 63: 399–403.
  • Heil, John. 2003. From an Ontological Point of View. Oxford: Oxford University Press.
  • Heil, John. 2012. The Universe As We Find It. Oxford: Oxford University Press.
  • Hempel, C. and Oppenheim, P. 1948. Studies in the logic of explanation. Philosophy of Science 15: 135–175.
  • Hirsch, E. 1993. Dividing Reality. Oxford: Oxford University Press.
  • Hochberg, H. 1987. Russell’s Analysis of Relational Predication and the Asymmetry of the Predication Relation. Philosophia 17: 439–59.
  • Hume, David. 1777. (Third Edition: 1975.) An Enquiry Concerning Human Understanding. Oxford: Clarendon Press.
  • Jacobs, Jonathan D. A powers theory of modality. Or, how I learned to stop worrying and reject possible worlds. Philosophical Studies 151: 227–48.
  • Jacobs, Jonathan D. (ed.). 2017. Causal Powers. Oxford: Oxford University Press.
  • Kaṇāda. Vaiśeṣika Sūtra.
  • Kim, Jaegwon. 1982. Psychophysical supervenience. Philosophical Studies 41: 51–70. Reprinted in Kim, 1993: 175–193.
  • Kim, Jaegwon. 1993. Supervenience and Mind. Cambridge: Cambridge University Press.
  • Kistler, M. 2002. The causal criterion of reality and the necessity of laws of nature. Metaphysica 3: 57–86.
  • Langton, Rae and Lewis, D. 1998. Defining ‘intrinsic’. Philosophy and Phenomenological Research 58: 333–345.
  • Lewis, David. 1973. Counterfactuals. Cambridge: Harvard University Press.
  • Lewis, David. 1983a. New work for a theory of universals. Australasian Journal of Philosophy 61: 343–77. Reprinted in Mellor and Oliver (eds.), 1997: 190–227.
  • Lewis, David. 1983b. Extrinsic properties. Philosophical Studies 44: 197–200.
  • Lewis, David. 1986. On the Plurality of Worlds. Oxford: Blackwell.
  • Lewis, David. 1994. Humean Supervenience Debugged. Mind 103: 473–490.
  • Lewis, David. 1997. Finkish Dispositions. The Philosophical Quarterly 47: 143–158.
  • Lewis, David. 2009. Ramseyan humility. In Braddon-Mitchell and Nola (eds.), 2009: 203–222.
  • Locke, D. 2012. Quidditism without Quiddities. Philosophical Studies 160: 345–363.
  • Locke, John. 1689. An Essay Concerning Human Understanding.
  • Loux, Michael J. (ed.). 2001. Metaphysics: Contemporary Readings. London: Routledge.
  • MacBride, Fraser. 2011. Relations and Truth-Making. Proceedings of the Aristotelian Society CXI: 159–76.
  • Manley, D. and Wasserman, R. 2008. On Linking Dispositions and Conditionals. Mind 117: 59–84.
  • Marmodoro, Anna. 2010a. Do powers need powers to make them powerful? In Marmodoro (ed.), 2010b: 337–352.
  • Marmodoro, Anna. (ed.). 2010b. The Metaphysics of Powers: their grounding and their manifestation. London: Routledge.
  • Marmodoro, Anna. 2015. Everything in Everything. Oxford: Oxford University Press.
  • Marshall, D. 2016. The Varieties of Intrinsicality. Philosophy and Phenomenological Research 92: 237–263.
  • Martin, C. B. 1994. Dispositions and Conditionals. Philosophical Quarterly 44: 1–8.
  • Matilal, Bimal Krishna. 1990. Logic, Language and Reality. New Delhi: Motilal Banarsidass Publishing.
  • Maurin, Anna-Sofia. 2002. If Tropes. Dordrecht, The Netherlands: Kluwer Academic Publishers.
  • Maurin, Anna-Sofia. 2010. Trope theory and the Bradley regress. Synthese 175: 311–326.
  • McGowan, Mary-Kate. 2002. The Neglected Controversy over Metaphysical Realism. Philosophy 77: 5–21.
  • Mellor, D H. and Oliver, A. (eds.). 1997. Properties. Oxford: Oxford University Press.
  • Millikan, R G. 1999. Historical Kinds and the Special Sciences. Philosophical Studies 95: 45–65.
  • Molnar, G. 2003. Powers: a study in metaphysics. Oxford: Oxford University Press.
  • Moore, G E. 1919. External and internal relations. Proceedings of the Aristotelian Society 20: 40–62.
  • Mulligan, K. (ed.). 1992. Language, Truth and Ontology. Dordrecht, The Netherlands: Kluwer Academic Publishers.
  • Mumford, S. 1998. Dispositions. Oxford: Oxford University Press.
  • Mumford, S. 2004. Laws in Nature. London: Routledge.
  • Mumford, S. and Anjum, R. L. 2011. Getting Causes From Powers. Oxford: Oxford University Press.
  • Nolan, Daniel. 2014. Hyperintensional metaphysics. Philosophical Studies 171: 149–160.
  • Orilia, Francesco. 2000. Argument Deletion, Thematic Roles, and Leibniz’s Logico-grammatical Analysis of Relations. History and Philosophy of Logic 21: 147–162.
  • Orilia, Francesco. 2006. States of affairs. Bradley vs. Meinong. In Raspa (ed.), 2006: 213–238.
  • Orilia, Francesco. 2011. Relational Order and Onto-Thematic Roles. Metaphysica 12: 1–18.
  • Parmenides. On Nature.
  • Plato. Parmenides.
  • Plato. Phaedo.
  • Plato. The Republic.
  • Prior, Arthur N. 1949. Determinables, Determinates, and Determinants (I, II). Mind 58 (229): 1–20, 58 (230): 178–94.
  • Prior, E. 1985. Dispositions. Aberdeen: Aberdeen University Press.
  • Putnam, H. 1981. Reason, Truth and History. Cambridge: Cambridge University Press.
  • Quine, W. V. 1948. On what there is. The Review of Metaphysics. Reprinted in Quine, 1953: 1–19.
  • Quine, W. V. 1953 (Second Edition 1960). From a Logical Point of View. Cambridge, MA: Harvard University Press.
  • Quine, W. V. 1960. Word and Object. Cambridge, MA: MIT Press.
  • Raspa, V. (ed.). 2006. Meinongian Issues in Contemporary Italian Philosophy. Frankfurt: Ontos.
  • Rodriguez-Pereyra, G. 2002. Resemblance Nominalism. Oxford: Oxford University Press.
  • Rosenkrantz, G. 1979. The pure and the impure. Logique et Analyse 88: 515–523.
  • Russell, B. 1903. The Principles of Mathematics. London: George Allen & Unwin.
  • Russell, B. 1905. On denoting. In Russell, 1994: 415–27.
  • Russell, B. 1924. Logical Atomism. Reprinted in his Logic and Knowledge: Essays 1901–1950, R C Marsh (ed.), London: George Allen & Unwin Ltd: 323–43.
  • Russell, B. 1994. The Collected Papers of Bertrand Russell 4. London: Routledge.
  • Ryle, G. 1949. The Concept of Mind. London: Penguin.
  • Schaffer, J. 2003. Is there a fundamental level? Noûs 37: 498–517.
  • Schaffer, J. 2005. Quiddistic knowledge. Philosophical Studies 123: 1–32.
  • Schroer, Robert. 2013. Can a single property be both dispositional and categorical? The “Partial Consideration Strategy” partially considered. Metaphysica 14: 63–77.
  • Shoemaker, S. 1980. Causality and Properties. Reprinted in Mellor and Oliver (eds.), 1997: 228–254.
  • Sider, Theodore. 1993. Intrinsic properties. Philosophical Studies 83: 1–27.
  • Swoyer, Chris. 1982. The nature of natural laws. Australasian Journal of Philosophy 60: 203–223.
  • Udayana. Kiraṇāvalī.
  • Vetter, B. 2015. Potentiality: From Dispositions to Modality. Oxford: Oxford University Press.
  • Wildman, N. 2013. Modality, Sparsity, and Essence. Philosophical Quarterly 63: 760–782.
  • Williams, Neil E. 2017. Powerful Perdurance: Linking Parts with Powers. In Jacobs (ed.), 2017: 139–164.
  • Wilson, Jessica M. 1999. How Superduper Does a Physicalist Supervenience Need to Be? The Philosophical Quarterly 49: 33–52.
  • Wilson, R (ed.). 1999. Species: New Interdisciplinary Essays. Cambridge, MA: MIT Press.
  • Yablo, Stephen. 1992. Mental Causation. The Philosophical Review 101: 245–280.
  • Zalta, Edward N. 1983. Abstract Objects: An Introduction to Axiomatic Metaphysics. Dordrecht: D. Reidel.
  • Zalta, Edward N. 1988. Intensional Logic and the Metaphysics of Intentionality. Cambridge, MA: MIT Press.
  • Zalta, Edward N. 2006. Essence and Modality. Mind 115: 659–693.

 

Author Information

Sophie Allen
Email: s.r.allen@keele.ac.uk
University of Keele
United Kingdom

Paulo Freire (1921—1997)

[Image: Paulo Freire. By Slobodan Dimitrov, own work, CC BY-SA 3.0]

Paulo Freire was one of the most influential philosophers of education of the twentieth century. He worked wholeheartedly to help people through both his philosophy and his practice of critical pedagogy. A native of Brazil, Freire sought to eradicate illiteracy among people from previously colonized countries and continents. His insights were rooted in the social and political realities of the children and grandchildren of former slaves. His ideas, life, and work served to ameliorate the living conditions of oppressed people.

This article examines key events in Freire’s life, as well as his ideas regarding pedagogy and political philosophy. In particular, it examines conscientização, critical pedagogy, Freire’s criticism of the banking model of education, and the process of internalization of one’s oppressors. As a humanist, Freire defended the theses that: (a) it is every person’s ontological vocation to become more human; (b) both the oppressor and the oppressed are diminished in their humanity when their relationship is characterized by oppressive dynamics; (c) through the process of conscientização, the oppressors and oppressed can come to understand their own power; and (d) ultimately the oppressed will be able to authentically change their circumstances only if their intentions and actions are consistent with their goal.

Table of Contents

  1. Colonized Brazil
  2. Early Years
  3. Influences on Freire
  4. Literacy Campaign
  5. Philosophical Contributions
    a. Critical Pedagogy Versus the Banking Model of Education
    b. Internalization
    c. Conscientização
    d. Freedom
  6. Pedagogy of the Oppressed
    a. Chapter 1
    b. Chapter 2
    c. Chapter 3
    d. Chapter 4
  7. Exile Years
  8. Return to Brazil
  9. Working Assumptions
  10. Criticisms
  11. Legacy
  12. References and Further Reading

1. Colonized Brazil

To understand Paulo Freire’s ideas and work, it is important to consider the context in which he developed his philosophy: the northeastern region of Brazil from the 1930s through the 1960s. Brazil was a Portuguese colony from 1500 to 1822. As was the case with other American colonies, most of the Indigenous people of Brazil perished due to harsh forced-labor conditions and because they had no immunity to European diseases. Some of the natives who survived were enslaved in engenhos (sugar mills). Since most of the Indigenous population died, the owners of the engenhos bought African people as slaves to work and to increase the production of sugar, which was one of the main Brazilian exports during the years Brazil was a Portuguese colony.

Most of the Brazilian population during the years of Portuguese colonization was of Indigenous and African descent. There was very little movement of Portuguese immigrants into Brazil. To the Portuguese, Brazil was primarily a commercial enterprise that allowed them to exploit the Brazilian resources in order to rival England and Holland economically. Newspapers were not published in Brazil until 1808, and literacy among the vast majority of Brazilians was simply nonexistent.

Freire’s life and work continue to ameliorate the aftermath of 400 years of colonization and slavery in the American continent. Slavery was officially abolished in Brazil in 1888, during a period of economic growth that followed Brazil’s independence from Portugal in 1822. However, even during the mid-20th century, the economic conditions for many Brazilians were so dire and the hunger they experienced so unbearable that many farmers sold themselves or members of their families into slavery in order to avoid starving.

2. Early Years

Paulo Reglus Neves Freire was born in Recife in 1921. Freire experienced firsthand the political instability as well as the economic hardships of the 1930s. Freire’s father died during the economic depression of the thirties, and while still a child, Freire came to know the crippling and dehumanizing effects of hunger. Young Freire found himself forced by circumstances to steal food for his family, and he ultimately dropped out of elementary school to work and help his family financially. It was through these hardships that Freire developed his unyielding sense of solidarity with the poor. From childhood on, Freire made a conscious commitment to work to improve the conditions of marginalized people.

Freire managed to finish elementary school between Recife and Jaboatão and later attended the secondary school, Oswaldo Cruz, in Recife. Aluízio Pessoa de Araújo, the principal of Oswaldo Cruz secondary school, agreed to allow Freire to study at a reduced tuition because Freire’s family could not afford to pay the full tuition. To reciprocate the favor, Freire began to teach Portuguese classes at Oswaldo Cruz in 1942. Freire then went on to study law at Recife’s School of Law from 1943 to 1947.

3. Influences on Freire

Paulo Freire’s thought and work were primarily influenced by his historical context, the history of Brazil, and his own experiences. Some of the early and lasting influences on Freire were his parents, his preschool teacher, and Aluízio Pessoa de Araújo, the principal of Oswaldo Cruz secondary school. The ideas that contributed to the development of Freire’s philosophy and work are existentialism, phenomenology, humanism, Marxism, and Christianity. The ideas of G. W. F. Hegel, Karl Marx, Anísio Teixeira, John Dewey, Albert Memmi, Erich Fromm, Frantz Fanon, and Antonio Gramsci were Freire’s major influences.

Freire learned tolerance and love from his parents. Freire’s father died in 1934 of complications from arteriosclerosis, when Freire was 13 years old. Freire’s mother assumed the responsibility of providing for her four children. Even though Freire’s childhood was not an easy one due to the death of his father and the economic conditions of the 1930s, Freire’s parents had created an environment of tolerance and understanding in his home.

Eunice Vasconcelos was Freire’s preschool teacher, and she greatly influenced his understanding of school and learning. Because of this experience, Freire came to love learning, and he came to see school as a place where one is encouraged to explore one’s curiosity. Another important influence on Freire was Aluízio Pessoa de Araújo. Freire’s mother, Edeltrudes, approached Araújo to ask if young Freire could study at his school. The only problem was that she was not able to pay for Freire’s tuition. Araújo accepted Freire into the school anyway because he was committed to teaching for the sake of helping people, and this proved to be a lasting influence on Freire.

Freire’s thought was deeply influenced by a number of G. W. F. Hegel’s ideas. Most notable are Hegel’s process metaphysics, social ethics, phenomenology, and the tension of the master versus slave dialectic. Throughout his writings, Freire makes the claim that the ontological vocation of all human beings is to become more human. While many of Freire’s readers and critics speculate that Freire assumes a substance metaphysics that reifies some types of human nature, other interpretations assume a Hegelian process metaphysics. If we assume the validity of this latter interpretation, then just as the unfolding of history culminates in Absolute Spirit for Hegel, so for Freire it is the process of becoming that is important. Freire was also influenced by Hegel’s communitarianism and worked with individual students always with the aim of benefiting the community as a whole. Freire understood the importance of empowering individuals (positive rights) and protecting them (negative rights), an understanding that follows from his commitment to the betterment of the community. Freire also adopted phenomenology as his preferred method not only for making sense of his context, but also for figuring out a way to help his students learn about their own contexts. Freire used phenomenology’s emphasis on subjectivity to help his students understand their own realities through their learning of language, or as Freire called it, “the word,” and to learn together how to speak their word. Hegel’s tension of the master versus slave dialectic became for Freire the tension between the oppressor and the oppressed.

Karl Marx’s ideas were among the foremost influences on Freire’s own philosophy. Among the ideas from Marx that influenced Freire are class consciousness, the concept of labor, and false consciousness. For Marx, when a person gains class consciousness, they become cognizant of their economic place in their society and thus of their class interests. Freire’s concept of conscientização points to the process of becoming aware not only of one’s class, but also more broadly of the roles one’s race, gender, physical ability, and so forth play in society. Freire, like Marx, believed that it is through our work that humans can change the world. Whether Freire’s students were construction workers, janitors, factory workers, or shoemakers, Freire used their work and the words for their tools both to teach them how to read and write and to show them how each of them transformed the world and made their world through their work. Just as Marx pointed to the spiritual loss that workers experienced from alienated labor, Freire aimed to prevent this loss and restore human dignity to the work of his students by sharing with them the transformative power of their work. What Freire refers to as the internalization of a master has its basis in Marx’s concept of false consciousness. For Marx, false consciousness takes place whenever a member of the proletariat mistakenly believes that they are not being exploited, or that by working harder, they will some day gain economic stability and freedom. For Freire, Marx’s false consciousness takes place when the oppressed internalize the ideology of the oppressor.

Freire was also influenced by Anísio Teixeira’s work and philosophy. Teixeira’s work called for the democratization of Brazilian society through education. Teixeira opposed the education of his time, which was exclusive to the upper classes and thus promoted a social elitism that left the majority of Brazilians without access to education. Teixeira worked toward establishing a free, public, secular education that would be accessible to everyone. Freire was moved by Teixeira’s questioning of why the average Brazilian did not embrace a democratic spirit, and both Teixeira and Freire agreed this was due to the traditionally hierarchical and authoritarian ways in which people had related to each other during the time that Brazil had been a Portuguese colony, and afterward while slavery continued to be an institution in Brazil. Freire, like Teixeira, believed in and worked toward the possibility of developing a democratic sensibility through education.

John Dewey’s philosophy of education was another influence on Freire’s philosophy and work, particularly regarding classroom dynamics and the dynamic between the teacher and the students. Teixeira had been a student of Dewey, and the importance of fostering a democratic sensibility through education became central to Freire. Freire believed the classroom was a place where social change could take place. Freire, like Dewey, believed that each student should play an active role in their own learning, instead of being a passive recipient of knowledge. Consequently, Dewey and Freire agreed that the ideal teacher would be open-minded and confident: confident in their competence while also open to sharing with and learning from their students. Both Dewey and Freire were critical of teachers whose dispositions were undemocratic, who transmitted information from the expert to the student, and who lacked the curiosity and confidence to continue learning from their students.

Existentialism was another significant influence on Freire’s philosophy. Freire believed that human beings are free to choose and thus responsible for their choices. Although Freire very much took into account the historical context created by the legacy of slavery in Brazil, he never believed the historical conditions determined the future for him, his students, or Brazilian society. On the contrary, Freire espoused the existential belief that humans need not be determined by the past. When Freire taught literacy classes, he did not only teach his students how to read and write. He also shared conscientização and, with this, the awareness that his students were free to choose the life they created for themselves.

Erich Fromm’s ideas also helped Freire discern how to bring about human liberation vis-à-vis the dominant ideology of Brazil at the time. Before Critical Theory, human reason was interpreted to be our source of rational, autonomous choices and enlightened dialogue. Marx problematized this assumption, however, when he pointed to false consciousness as one of the ways through which the dominant ideology becomes an instrument of domination that controls human choices and promotes alienation. Freire relied on Fromm’s understanding of human freedom and Fromm’s discussion of control to come to his own understanding of the dynamic between the oppressors and the oppressed. Like the existentialists before him, Fromm advocated the creation of human values instead of following pre-established and unquestioned norms. Freire was influenced by Fromm’s understanding of freedom to develop the liberatory praxis of critical pedagogy, whereby the people in the classroom contribute to each other’s conscientização and thus embrace and claim their own freedom. In order to explain the difference between humanism and humanitarianism, Freire used the biophilic and necrophilic concepts from Fromm. In his book The Heart of Man (1967), Fromm distinguishes between two types of approaches to helping others. One approach is to feel the need to control the situation and the people who are being helped. The other approach is to allow the situation and the people to be what they potentially may be. Fromm characterizes the people who feel the need to control as necrophilic because in their need to control other people and the events in life itself, they deny people and life their own possibilities. According to Fromm, those who are able to allow other people and events to unfold into what they may become are biophilic because they respect the freedom and creativity of human beings and trust in the unfolding of life’s events.

The ideas of Albert Memmi and Frantz Fanon helped Freire to make sense first of the Brazilian and then the Latin American, African, and Asian colonized experience. Although Freire was deeply influenced by Marx’s analysis of economic classes, the Brazilian and Latin American histories could not be understood by class analysis alone due to the history of colonization and slavery. Freire agreed with Memmi that the primary reason for colonization was economic. Freire believed there were two reasons why the literacy rate was so low in northeastern Brazil. The first was that the Portuguese were primarily concerned with the economic exploitation of Brazil and its people. As was the case in other Latin American countries, Catholic priests did educate some of the people and advanced to some degree the interests of the natives; however, according to Freire’s understanding, and influenced by Memmi, the colonization of Brazil was first and foremost an economic endeavor. The second reason was the exploitation of the land’s resources and the people’s labor through the institution of slavery and its aftermath. In agreement with Teixeira, Freire believed the lack of democratic sensibility and education in Brazil was precisely due to the history of colonization in Brazil.

Besides Memmi, Fanon was deeply influential in Freire’s understanding of the colonized experience. Perhaps the most salient influence of Fanon on Freire was Fanon’s idea that the oppressed must be actively engaged at every step of gaining their own freedom. In other words, the oppressed cannot and should not be liberated by anyone other than themselves. Fanon’s discussion of language, in his case the difference between “proper French” and his Creole French, also influenced Freire’s understanding and teaching of Portuguese in such a way that Freire always acknowledged the legitimacy of his students’ way of speaking the Portuguese language.

Freire’s philosophical development was also influenced by several of Antonio Gramsci’s ideas. Gramsci’s idea of the organic intellectual influenced Freire to believe in the importance of educating and fostering the development of his working-class students. Influenced by Fanon and Gramsci, Freire was committed to the idea and practice of legitimizing the experiences and knowledge of his students so that organic intellectuals would emerge. These organic intellectuals would in turn be in the best position to contribute to the solutions of the community’s problems since they would know their community, the intricacies of their context, and their problems and solutions better than any expert who had studied the problem merely academically.

Equally important to the theoretical influences here mentioned was the spiritual influence that Christianity had on Freire’s philosophy. Freire was particularly influenced by liberation theology as it developed in Latin America. Liberation theology prioritized fighting poverty, political activism, practice, and social justice. Freire’s philosophy was very much in line with the grassroots, bottom-up organization of liberation theology, which emphasized the importance of practicing the teachings of Jesus Christ instead of obediently following the established orthodox church hierarchy.

4. Literacy Campaign

Paulo Freire began to work with illiterate peasants and workers in the northeastern region of Brazil in 1947, and by the beginning of the 1960s, he had organized a popular movement to eradicate illiteracy. Due to the Portuguese colonization of Brazil, as well as the institution of slavery, the literacy level of most Brazilians was extremely low. The population of the northeastern region of Brazil in 1962 was 25 million, and of these, approximately 15 million were illiterate.

In 1947, when Freire was 26 years old and while he was still teaching language classes at Oswaldo Cruz secondary school, he began to work at the government agency called the Serviço Social da Indústria (SESI). He was appointed to work as an assistant in the Division of Public Relations, Education and Culture. The goal of this agency was to provide social services in the areas of health, housing, education, and leisure for the Brazilian working class.

Freire worked at SESI for 10 years, and during this time, he learned much about the Brazilian working class and the Brazilian school system that informed how he would later develop as a teacher and political thinker. Freire worked closely with the schools, examining how policy was made and how it affected the quality of education for the students. It was during this time that Freire noticed how some of the Brazilian working-class parents were raising their children. Although Freire had been brought up in a tolerant environment, this was not the case in most other homes. Freire came to SESI with a democratic sensibility; however, he was met with what seemed to be a type of conditioned authoritarianism that affected how parents related to their children and how teachers approached their teaching. Physical punishment of children was often used by both parents and teachers. Freire noticed that the harsh physical punishment the children were subjected to did not serve the intended purpose; instead, children were alienated from their parents and teachers, and an environment of harsh authoritarianism was more firmly established. Consequently, Freire began training teachers and parents in more tolerant ways of teaching and disciplining their children.

During the 10 years that Freire worked for SESI, he gathered many experiences that would later help him shape his doctoral studies and dissertation at the University of Recife. After his work for SESI, Freire accepted a position as a consultant for the Division of Research and Planning. It was during this time that Freire began to establish himself as a progressive educator. He conducted studies in adult education and marginal populations and presented these at national adult education conferences. His early ideas were of cooperative decision-making, social participation, and political responsibility. Freire did not see education as merely a way to master academic standards or skills that would help a person professionally. Instead, he cared that learners understood their social problems and that they discovered themselves as creative agents. In 1959, Freire completed his doctoral dissertation titled Educação e Atualidade Brasileira (Present-day Education in Brazil).

In 1961, the mayor of Recife, Miguel Arraes, asked Freire to help develop literacy programs for the city. The goal of these programs was primarily to encourage literacy among the working class, to foster a democratic climate, and to preserve their Indigenous traditions, beliefs, and culture. It was during this time that Freire began to work with his cultural circles and found out just how damaging and pervasive the institution of slavery continued to be, even decades after slavery had been abolished.

Freire decided to use the name “cultural circles” instead of literacy classes. He had several reasons for this choice of words, and one reason was the negative connotation of the word “illiterate.” Although most of his students were, as a matter of fact, illiterate, no one wanted to describe or think of themselves as such. Another reason was that Freire’s project did not focus solely on teaching people how to read and write. At the time, literacy was one of the requirements for voting in presidential elections, and Freire meant to create a sense of political awareness by the methods he used to teach as well as the content he shared with his students.

The teachers of the cultural circles were deliberately not called teachers, but rather coordinators, and the students were instead called participants. Instead of traditional lectures, dialogue was encouraged. Freire chose not to use the traditional language primers because their content was often irrelevant to the cultural context of the peasants and the workers he taught. Instead, Freire began with the existential conditions of the learners. Freire required that the coordinators be driven by love, be guided by humility, and have great faith in human potential. He asked that they consider education as a vehicle for liberation instead of domestication.

Also in 1961, João Goulart assumed the presidency of Brazil. Goulart was a populist leader, so when he was elected, many student groups, unions, and peasant leagues began to emerge. At the same time, a communist presence was more clearly felt in Brazil. It was partly because of these events that Freire transferred the cultural circles from the city of Recife to the Cultural Extension Service (SEC) at the University of Recife. From June of 1963 to March of 1964, Freire and his team trained college students and other interested people in how to work with adult literacy learners. Freire planned to reach as much of Brazil as he could by establishing more than 20,000 cultural circles around the country. His plan was to teach five million adult learners how to read and write within a two-year period.

On April 1, 1964, a military coup that was supported by the CIA overthrew the Goulart administration. The mayor of Recife, Pelópidas Silveira, was arrested, Freire was discharged from his position, and all of Freire’s teaching materials were confiscated. Freire was subjected to a series of interrogations and accused of being a communist. He spent 75 days in jail, where he began to write his first book, Educação como Prática da Liberdade (Education as the Practice of Freedom). The new military regime deemed Freire’s literacy project subversive and stopped the funding for the project. Freire and his family were exiled from Brazil from 1964 to 1980. They first lived in Bolivia, then in Chile, where Freire continued his literacy project with Chilean farmers.

In the process of working with both Brazilian and Chilean peasants, Freire realized that even though people were no longer enslaved and had learned how to read and write, and in some cases owned their own land, they did not consider themselves free. With this insight, one of Freire’s lifelong goals became to create the circumstances for his students to discover themselves as human beings, with their own agency as subjects and not objects, as members of a community, and as the creators of culture.

5. Philosophical Contributions

a. Critical Pedagogy Versus the Banking Model of Education

Paulo Freire’s philosophical views grew from his experiences as a teacher and the interactions he had with his students. Rather than continuing with the established cultural patterns of relating to people through a hierarchy of power, Freire’s starting point in the classroom aims to undermine the power dynamics that hold some people above others. Freire emphasizes that a democratic relationship between the teacher and her students is necessary in order for the conscientização process to take place.

Freire’s critical pedagogy, or problem-posing education, uses a democratic approach in order to reach the democratic ideal, and, in this sense, the goal and the process are consistent. He explains how the teacher who intends to hold herself at some higher level of power than that of her students, and who does not admit to her own fallible nature and ignorance, places herself in rigid and deadlocked positions. She pretends to be the one who knows while the students are the ones who do not know. The rigidity of holding this type of power dynamic negates education as a process of inquiry and of knowledge gained.

Freire is very critical of teachers who see themselves as the sole possessors of knowledge while they see their students as empty receptacles into which teachers must deposit their knowledge. He calls this pedagogical approach the “banking method” of education. This approach is similar to the process of colonization, given that the colonizing culture thinks of itself as the correct and valuable culture, while the colonized culture is deemed inferior and in need of the colonizing culture for its own betterment. The banking method is a violent way to treat students because students are human beings with their own inclinations and legitimate ways of thinking. The banking method treats students as though they were things instead of human beings.

Instead of the banking method, Freire proposes a reciprocal relationship between the teacher and the students in a democratic environment that allows everyone to learn from each other. The banking method of education is characterized as a vertical relationship:

teacher
↓
student

The relationship developed through the banking method between the teacher and the students is characterized by insecurity, suspicion of one another, the teacher’s need to maintain control, and power dynamics within a hierarchy that are oppressive. The critical pedagogy that Freire proposes allows for a horizontal type of relationship:

teacher ↔ student

This relationship is democratic insofar as both the teacher and the student are willing and open to the possibility of learning from each other. With this type of relationship, no one is above anyone, and there is mutual respect. Both the teacher and the student acknowledge that they each have different experiences and expertise to offer to each other so that both can benefit from the other to learn and grow as human beings.

Instead of tacitly promoting oppressive relationships through the banking method of education, Freire chooses the process of critical pedagogy as his pedagogical model. This is because critical pedagogy utilizes dialogue among human beings who are equals rather than oppressive imposition.

Another negative consequence of the banking method is that students are not encouraged to think critically, and thus do not learn how to do so or to feel confident about thinking for themselves. The relationship between a student and a teacher who uses the banking method is similar to that of a farmer who obeys the orders of his or her boss. As was the case with the peasants with whom Freire worked, when a person’s day-to-day experience is dominated by another person or group of people, most of the dominated people are not capable of developing the ability to think, to question, or to analyze situations for themselves. Instead, their consciousness develops primarily to obey the orders imposed on them.

To promote democratic interactions between people, Freire suggests that teachers problematize the issue being discussed. When issues or questions are problematized by teachers who work through critical pedagogy, ready-made answers are not available. Students realize that although some questions do have clear-cut answers, many of our deeper questions do not have obvious answers. When students learn that teachers are human beings just like everyone else, and that teachers do not know everything but are also learners, students then feel more confident in their own search for answers and more comfortable critically raising questions of their own. The banking method denies the need for dialogue because it assumes that the teacher is the one who possesses all the answers and the students are ignorant and in need of the teacher’s knowledge. In order to problematize a subject, the teacher assumes a humble and open attitude. Given the teacher’s personal example, the students also become open to the possibility of considering the different positions being discussed. This promotes a dynamic of tolerance and democratic awareness because critical pedagogy undermines relationships where some people have power or knowledge and some do not, and where some people give orders and others obey without questioning. Problematizing promotes dialogue and a sense of critical analysis that allows students to develop the disposition for dialogue not only in the classroom but also outside of it. This is of utmost importance because the disposition for and value of dialogue spill over in a positive way to the students’ other relationships, at home, in the workplace, and in the community.

b. Internalization

Paulo Freire worked with people who came from a context of pervasive historical oppression. Most of his students came from families who had been previously enslaved, and Freire came to understand that abolishing slavery did not automatically mean that people were free. He also realized that teaching people how to read and write so they could vote in Brazilian elections, that is, enabling people through positive rights, was still not enough for people to realize their own freedom and end their oppression. Freire recognized that the oppression of a human being runs much deeper than political institutions and legal guarantees. He discovered that while we may actively seek our freedom, besides the institutional obstacles like colonization and dictatorships, there are also internal obstacles that prevent us from being free. The concept of internalization treated in this section is psychologically deep and rich in meaning.

In order to explain what internalization means, Freire writes about an incident in a Latin American latifundio (plantation) where a group of armed peasants took over the plantation. For tactical reasons they wanted to keep the landowner boss as a hostage. However, not a single peasant was able to keep guard over the boss because his very presence frightened them. Freire speculates that the very act of fighting against their boss may have made the peasants feel guilty. Freire concludes that, in fact, the boss was “inside” them. These peasants had internalized their master. Although the boss had been overpowered by the peasants who outnumbered him, and was thus in no position to give them orders or punish them if they disobeyed, the peasants’ behavior was still driven by fear of their boss. The freedom of the peasants was not merely contingent upon their physically removing the boss from the plantation, as they had initially believed. These peasants had been thoroughly conditioned to obey orders, to behave in a submissive way, to know and keep their “place,” which they did even when the boss was no longer in power.

Whenever we internalize our oppressors, we behave in the way the oppressor would have us behave even if they were not present. The example that Freire provides is a very telling one, and other common examples would be those of internalized racism or internalized patriarchy. To internalize racism, for instance, means that a racist person need not be present to oppress another—the person who has internalized racism behaves in a way that promotes the power of the oppressor and reifies the oppressive structure. An example of internalized racism in the 21st century would be dark-skinned people promoting whiteness, for instance by using whitening creams. An example of internalized patriarchy may be when a man feels like crying but does not because he does not want to seem weak. All of these are different ways in which people internalize an oppressive structure and then seek freedom and power within that structure. There are many other ways in which we internalize oppressive structures besides racism and patriarchy, such as our nationality, age, patterns of speech, weight, sexuality, or being able-bodied or disabled.

c. Conscientização

As previously mentioned, Paulo Freire worked with people who had been socialized within institutions shaped by the oppression of colonization. It bears repeating that although slavery was formally abolished in 1888, people continued to sell themselves into slavery during Freire’s time. Freire worked with the sons, daughters, and grandchildren of former slaves, and he noticed that the power dynamics of the institution of slavery continued to affect how people saw themselves and how they related to the people around them.

Conscientização is often described as the process of becoming aware of social and political contradictions and then acting against the oppressive elements of our sociopolitical conditions. This entails developing a critical attitude to help us understand and analyze the human relationships through which we discover ourselves. Conscientização usually begins with the individual person becoming aware of her own social context, political context, economic context, gender, social class, sexuality, and race and how these play an important role in the shaping of her reality. The process of conscientização also entails becoming aware of our agency to choose and create our reality.

Harriet Tubman, the African-American abolitionist, is known to have said that she would have freed more slaves, but the problem was that not all of them knew they were slaves. Tubman’s observation captures the heart of conscientização. When a person or group of people has been socialized within an oppressive system such as slavery or patriarchy, it is often the case that the oppressed internalize the oppression and do not know that they are oppressed. To illustrate, before becoming politically aware, a woman, let us call her Jane, might behave by and within the norms of patriarchy all of her life. If, for instance, Jane applies for a promotion at work and the promotion is denied to her but is instead given to a less qualified and younger man, Jane’s conscientização regarding sexism and ageism may begin.

Because of their history and their socio-political and economic contexts, the workers and peasants that Freire worked with were often not aware of the extent of their own oppression. Since they had been socialized to obey orders, to perform specific functions, and to not question authority figures, they were discouraged from following their own interests and from thinking for themselves. Freire noticed that his students would often think of themselves as objects instead of subjects and agents with the ability to choose their own destiny.

There are several steps in the process of conscientização. Freire worked with his students in his cultural circles and chose a curriculum that allowed him to help his students become aware of their socio-political realities. Freire began the process by creating the conditions through which his students could realize their own agency. He describes this first step as being able to identify the difference between what it means to be an object (a thing) and a subject (a human being). Once the first step of the process had been taken, namely the recognition of their agency, Freire emphasized to his students how the consequences of their choices did in fact shape their personal history as well as contribute to the creation of human culture. Equally important, Freire also highlighted the fact that every single human being has the ability to change the world for the better through their work. This was very important because it allowed common men and women to see their own self-worth. Given that their dialect, race, work, and culture were constantly demeaned by a system of oppression, Freire affirmed the worth of every person and that person’s work. Freire’s students came to see themselves as the makers of their own destinies, as confident shoemakers and weavers who created art, and whose culture and dialects were important and valuable.

d. Freedom

Paulo Freire writes about an instance when he asked his students what the difference was between animals and humans. The answers given to him are troubling and insightful. The peasants of course had the ability to become aware of their own agency, but before they began the process of conscientização, they did not think of themselves as being free. When the students were asked about the difference between animals and humans, one of the peasants in the cultural circles in Chile responded that there was no difference between men and animals, and if there was a difference, animals were better off because animals were freer. According to this peasant, an animal enjoys a greater degree of freedom than a human being.

The peasant’s honest answer is indicative of how he saw himself and the context in which Freire worked. Although they were not legally enslaved, these peasants did not think of themselves as being free agents, as subjects with the option to choose and create their own lives and history. Instead, they saw themselves as objects upon whom orders were imposed, so the animals that were not required to follow orders were freer than them. In other words, for these peasants there was no real difference between them and the beasts of burden used to toil in the fields, unless the animal, a fox or bird for instance, was not used for farm labor. In this case, the animal had a higher degree of freedom than a human being.

These responses are indicative of the fact that the “freedom” of the peasants must be qualified. It is true that technically and politically they were no longer slaves. However, they did not think of themselves as being free human beings with their own agency and the ability to decide for themselves. Through working with the South American peasants in Brazil and Chile, Freire came to see that these peasants were not merely a marginalized group of people, but, worse than this, they saw themselves as existing solely for the benefit of their bosses, not as existing for themselves and for their own sake. Their social context had conditioned them into believing that the purpose of their being was only to benefit their bosses. Their economic and political contexts conditioned them to not see themselves as human beings (subjects), but rather as things or objects that exist merely to serve the bosses’ orders. The problem was not simply that they were illiterate but that they were completely alienated from their own agency. When Freire understood the extent of his students’ oppression, he chose to not only teach them how to read and write but also to create the conditions necessary in the classroom for the students to realize their own agency and come to see themselves as human beings. The process of conscientização is much more than learning a set of habits or skills. It is becoming aware of one’s own agency as a human being.

The concept of “freedom” has many connotations. Freedom may mean being able to move about freely or it may mean not being enslaved, for instance. Freire believed that “freedom” is the right of every human being to become more human. Freire noticed that “freedom” meant something different for the peasants with whom he worked. Freire explained that the peasants he worked with wanted land reform—not to be free, but rather to be able to own their own land and thus become landowners, or more specifically, the bosses of new employees.

Freire wrote that the peasants’ goal was in fact to be free human beings, but that to be a free human being within the contradictory context in which they had been socialized, and which they had clearly not overcome, meant to be an oppressor. Freire writes how the oppressed find in the oppressor their model of “manhood” or their model of humanity, of what it means to be a free person. The peasants had come to equate freedom with the ability to oppress others. This is because the context within which they lived cast the boss as “free,” given that the boss was the one in charge who commanded the peasants to follow his or her orders. The peasants were in turn cast as not being free because they had no choice but to carry out the boss’s orders. Given this historical context, the only example the peasants had of what it meant to be a free person was the example of an abusive boss. Thus, the peasants came to believe their freedom could be found only by oppressing others.

Having the right to vote, to own property, to free speech, or to an education—though undeniably important—does not mean that a person is free. There are different ways in which people may be free, and freedom is a matter of degree. Contrary to the mainstream Western liberal belief, the fact that we are not enslaved physically does not mean that we are free, and it does not mean that we are not behaving the way our internalized oppressors would have us behave.

Freire adamantly opposed authoritarian relationships, which only cause further oppression. This is not merely for the sake of the oppressed, but also for the sake of the oppressors who become oppressed themselves through the dynamics of oppressive relationships. Freire writes how the fear of freedom is embodied by the oppressors but in a different way than by the oppressed. For the oppressed, the fear of freedom is the fear to assume or own up to their own freedom. For the oppressors, the fear is fear of losing the “freedom” to oppress.

6. Pedagogy of the Oppressed

Pedagogy of the Oppressed is Paulo Freire’s best-known work. He wrote it during his first years of exile from Brazil and published it in 1968. The book was translated into English in 1970. It has been banned and blacklisted numerous times by different governments who find the book to be subversive and dangerous. Among these governments was the South African government during Apartheid. In the United States of America in the 21st century, the book was banned from being taught in public schools in the state of Arizona under House Bill 2281.

Pedagogy of the Oppressed is divided into four chapters, and several important themes are developed throughout the book. Among these themes are how the oppressed and the oppressors are affected by the act of oppression, liberation as a mutual process, the banking model of education, the incompleteness of human beings, generative themes, and the use of cooperation, unity, and organization to liberate the oppressed.

a. Chapter 1

There are several important ideas elaborated in the first chapter of Pedagogy of the Oppressed. All of these ideas are developed throughout the book, and Paulo Freire comes back to them throughout his later books and writings. The first thesis is that the dehumanizing situation under which many people live is not a given destiny but rather the result of unjust systematic oppression that fosters violence in the oppressors and dehumanizes the oppressed. Here, Freire advances one of his central theses, namely, that in their struggle to regain their humanity, the oppressed must not become the oppressors of their oppressors. Freire claims that it is only the oppressed who will be able to liberate both themselves and their oppressors by restoring the humanity of both groups.

Freire warns the oppressed against becoming oppressors in two ways: (1) by gaining power and using this power to oppress their previous oppressors; or (2) by gaining power over other oppressed people and becoming their oppressors as they seek their own individual liberation. The danger of a previously oppressed person becoming an oppressor is due to their ambiguous duality. Freire points out that the oppressed are at one and the same time both themselves (the oppressed) and the oppressor, whose consciousness they have internalized. Due to this ambiguous duality and the internalization of their oppressors, the oppressed seek to become like the oppressors and share in their way of life.

In this chapter, Freire also begins his criticism of charity as opposed to social justice, a distinction he makes throughout Pedagogy of the Oppressed and throughout the rest of his life. If social justice were in fact the existing state of affairs in society, Freire argues, there would be no need for charity. In this first chapter, Freire begins to discuss what he calls a false charity or a false generosity that is displayed by the oppressors toward the oppressed in the form of social programs and aid. However, Freire points out, the dispensers of this false generosity often feel threatened by those they claim they wish to help (the oppressed). This is a theme Freire maintains throughout his writings. Freire explains how the oppressors must perpetuate injustice in order to be able to express their false generosity. Freire develops this idea further in chapter four of Pedagogy of the Oppressed and comes back to it in his Education for Critical Consciousness.

Freire also puts forth the thesis that freedom is acquired by conquest, that a person must claim their own freedom because freedom is not something that can be gifted to a person by another. This is a thesis that Freire continues to develop throughout his life. In this chapter, Freire begins by telling us that, oftentimes, members of the oppressor class have a change of heart and seek to cease being exploiters of the oppressed. However, Freire warns us that the heirs of exploitation, due to their origin, almost always bring with them their prejudices. Because of their background, even when they seek to help the oppressed, they mistrust the people’s ability to transform their own circumstances and instead believe that they must be in control of the change that takes place. In other words, they still behave paternalistically and believe that they know better than the people they falsely claim to respect.

Freire closes the first chapter of Pedagogy of the Oppressed by emphasizing how the oppressed must be intimately involved in each stage of their liberation. This is because, as he emphasizes, freedom is something each of us must claim for ourselves; freedom is not a gift to be given by some people to others.

b. Chapter 2

The most important idea that Paulo Freire develops in chapter two of Pedagogy of the Oppressed is the distinction between the banking model of education and a critical pedagogy. Please see section 5a for a detailed explanation of this central Freirean concept and practice.

A central element of Freire’s pedagogy is dialogue, and he emphasizes its importance in this chapter. Freire prefers dialogue to imposition. He writes that it is love and respect that allow us to engage people in dialogue and to discover ourselves in the process. By its nature, dialogue is not something that can be imposed. Instead, genuine dialogue is characterized by respect of the parties involved toward one another. We develop a tolerant sensibility during the dialogue process, and it is only when we come to tolerate the points of view and ways of being of others that we might be able to learn from them and about ourselves in the process.

Freire believes that it is necessary for us to develop our tolerance of others so that all may learn from each other. However, tolerating others does not mean that one has to stop being who one is as one tolerates others’ behavior and ways of thinking. Dialogue and imposition are diametrically opposed approaches to relating to one another. According to Freire, imposition of our views upon others comes from a lack of confidence in our own beliefs. The person who either imposes or attempts to impose her views on others behaves in a life-denying manner insofar as she seeks to control others and insofar as she thinks in absolute terms with predetermined conclusions. Dialogue, on the other hand, comes from a place of tolerance. Dialogue can take place when we are comfortable with and confident in our beliefs and ourselves so that even if others disagree with us, we do not interpret their disagreement to mean that we are wrong. Dialogue is life-affirming and allows people and situations to be what they may become; it understands life and people as developing in an open-ended creative process. Instead of believing that “The Answers” or “The Truth” have already been determined, a person who engages others in dialogue believes that the answers and the truth will emerge as we listen and speak to one another. The control of the process comes through the development of the dialogue itself. Those who impose their views on others are afraid of losing their false sense of control. Dialogue, on the other hand, comes from a place of love, respect, trust, humility, and curiosity, and it assumes remaining open to change, to the tensions caused by uncertainty and the precarious, as well as to the further developments that unfold.

c. Chapter 3

In chapter three of Pedagogy of the Oppressed, Freire continues to develop his thesis on helping. He elaborates on the idea that those who educate, facilitate, or help in any way (be it social workers, research teams from universities, and so forth) must first learn to listen to and work with those whom they are helping. Freire is critical of professionals who have internalized the patterns of institutional domination in which they were socialized so that they come to believe that being in a position of power or having some form of institutional authority allows them to help the oppressed with top-down strategies and means. Freire’s criticism is that these “helpers” have come to believe that they have the right type of knowledge, the expertise, and the answers to what the people they are “helping” need, so that their approach to helping runs from those who can and who know to those who have not been able to or who do not know:

Political leader/teacher/researcher/social worker
↓↓
students/community members being helped

The problem with this approach is that those who offer their help and expertise, those who are confident in their good intentions and qualifications, do not always trust that the ones who are the most knowledgeable of the problem and the solutions needed are the same people who need the help.

Relatedly, Freire makes a distinction between humanitarianism and humanism. Although both concepts mean well for the whole of humanity, they are not the same, nor do they achieve the same results. Freire was critical of social movements that pretend to give humanitarian aid. This was because he noticed that what oftentimes happens is that in the process of “helping,” the helpers rob the people being helped of their own agency to improve their own condition. There are ways to help people that promote the autonomy of the person or the group of people being helped and other ways of “helping” that impose our assistance on those who ask for our help. This is an important distinction because a humanitarian approach does not lend itself to dialogue insofar as the person in the helping position claims to know what the person in need of help needs and imposes the help. The humanist respects the person in need of help and offers help in such a way as to enable the person being helped to help herself.

Besides developing his thesis on helping, Freire also elaborates on what he terms “limit situations.” In his cultural circles, Freire began his literacy classes by making use of generative themes and words. These would be words such as tijolo (brick). The word would be broken down into its syllables (ti-jo-lo), then the students would practice enunciating the consonants coupled with vowels (ta, te, ti, to, tu; ja, je, ji, jo, ju; la, le, li, lo, lu) and then combine the syllables to generate new words. Sometimes the generative words would be “land,” “economy,” and “culture,” for instance. The facilitator and the students would not only break down the generative words into syllables, but they would also discuss their meanings. There would be times when, in the process of discussing certain generative words and themes, the class would come to a “limit situation.” These limit situations described a shared problem that the participants of the class and the facilitator, by working together, could overcome, for instance, putting up stop signs at intersections where they were needed.

d. Chapter 4

Paulo Freire is very critical of all liberation and populist movements that deny the oppressed the right to participate in their own liberation. Leaders of revolutionary movements cannot bestow freedom upon the oppressed, nor can they temporarily use oppressive means to liberate them after the revolutionary movement comes to an end. Leaders are responsible for coordinating and facilitating dialogue among citizens, but, as Freire points out, leaders who deny the participation of the people they are trying to help effectively undermine their very goal of helping.

Besides insisting that the solutions we seek come from problems rooted in our experience, Freire motions us toward adopting a pluralistic sensibility that respects the “other,” given that there is more than one way of being. A pluralistic sensibility is manifested through the tolerance we exercise during any dialogue. Democratic interactions are based on a type of faith in humanity, in the belief that all are able to discuss their problems, that is, the problems of their country, continent, world, work, and of democracy itself. In order to engage and be engaged by others in dialogue, it is necessary that we cultivate a sensibility of confidence, humility, and willingness to risk loving others and that we allow others to be who they are. Genuine dialogue is not possible without these values. Freire did not pretend to have any solutions other than to suggest that an open-ended dialogue could lead us to have a more just and humane world.

7. Exile Years

Paulo Freire lived in exile from 1964 to 1980 in Bolivia, Chile, the U. S. A., and Switzerland. Bolivia was the first country where Freire lived in exile from Brazil, but he only stayed in Bolivia for a brief time. Given that Freire had lived his whole life at sea level, the high altitude of the Andes did not sit well with him, and he had a very difficult time adjusting to the altitude of La Paz. Shortly after his arrival in Bolivia, a coup overthrew the administration of Víctor Paz Estenssoro. Due to the political climate and the high altitude, Freire sought political asylum in Chile, where he lived from 1964 to 1969.

The five years that Freire lived in Chile proved to be very fruitful in terms of his writing and research. Freire was also able to continue and make advances with his work on literacy. Freire worked for the Instituto de Desarrollo Agropecuario (Institute for the Development of Agriculture) and with the University of Chile with the Department of Special Planning for the Education of Adults. Freire’s literacy model was successfully adopted, and this led Freire to participate in the Chilean agrarian reform effort. At this time, the United Nations Educational, Scientific and Cultural Organization (UNESCO) approached Freire to become a consultant, and Freire continued to assist the organization of cultural circles throughout Chile.

The five years that Freire lived in Chile were very good years for him and his family. The Chilean people came to love Freire and made him feel welcome. Working with the Chilean peasants was also very helpful to Freire insofar as his experiences with them allowed him to notice differences between the illiterate peasants in Brazil and Chile. Although their histories were similar, they were not the same people, and so Freire came to understand the experience of the oppressed more fully by also working with the Chilean peasants.

It was during his time in Chile that Freire was able to complete the manuscript of his first book, Educação como Prática da Liberdade (Education as the Practice of Freedom), which was published in 1967 in Rio de Janeiro. Freire was also able to write the manuscript of Pedagogy of the Oppressed based on his experiences in Brazil and Chile. Pedagogy of the Oppressed was first published in Spanish in 1968, and because of the political climate in Brazil, the book had to wait until 1975 to be published in Portuguese. By this time, Pedagogy of the Oppressed had already been translated into English, Italian, French, and German.

In 1968, Freire received invitations from Harvard University and the World Council of Churches (WCC) in Geneva, Switzerland. He agreed to go to Harvard first and then to Geneva, departing from Chile in 1969 to live in Cambridge, Massachusetts, from April 1969 to February 1970. He taught at Harvard’s Center for the Study of Change and Social Development. During Freire’s time at Harvard, he worked as a visiting professor and gave lectures and conferences. He also published “The Adult Literacy Process as Cultural Action for Freedom” and “Cultural Action and Conscientization” in the Harvard Educational Review. These were later published as the monograph titled Cultural Action for Freedom in 1972.

Freire’s time in the U. S. A. allowed him to experience racism and discrimination first-hand as he saw the way people had to make do in the low-income housing and ghettos of New York City. These experiences, like the ones he had with the Chilean peasants, added to his Brazilian experiences and broadened his vision regarding the struggles of the oppressed. He understood that the third world and first world categories were not so clear cut, but rather that poverty and oppression could be found in developed countries as well.

After his time in the U. S. A., Freire lived in Switzerland, from 1970 until his return to Brazil in 1980. Freire worked for the World Council of Churches (WCC) as a consultant for the Office of Education and popular educational reform. In 1971, Freire, in collaboration with other Brazilian exiles, formed the Institute of Cultural Action (IDAC) in Geneva. The goal of IDAC was to bring about a pedagogical practice that brought awareness to the political dimensions of pedagogy. Through his involvement with the WCC and IDAC, Freire traveled to and worked in South and Central America, Africa, Australia, the Middle East, Asia, Europe, and North America.

Because of Freire’s deep interest in and empathy toward colonized countries, he followed closely the liberation struggles of African countries, specifically Mozambique, Angola, Cape Verde, São Tomé and Príncipe, and Guinea-Bissau. In 1975, the newly formed government of Guinea-Bissau invited Freire to help organize a literacy campaign. Guinea-Bissau had been colonized by the Portuguese since 1440, and by 1975 it had a 90 percent adult illiteracy rate.

8. Return to Brazil

Paulo Freire lived in exile for close to 16 years, from 1964 to 1980. Upon his return to Brazil, he continued his work as an educator until his death in 1997. From 1980 to 1990, he worked at the Universidade Estadual de Campinas (UNICAMP) and as a professor in the Postgraduate Education program at the Pontifícia Universidade Católica de São Paulo (PUC-SP). In 1987, he was reinstated as Senior Professor at the Federal University of Pernambuco; however, Freire immediately retired from this position in order to make space for the younger generation of professors. At that time, Freire became Professor Emeritus at the Federal University of Pernambuco.

In 1980, Freire was intimately involved in founding the Partido dos Trabalhadores (PT) (Workers’ Party). This political party challenged the military rule and promoted democracy in Brazil. In 1989, Freire accepted an invitation to become the Secretary of Education for the city of São Paulo. At that time, São Paulo had 12 million people, with 720,000 students in 654 K-8 schools. He served as Secretary of Education for two years, until May 1991. In that role, Freire began working toward improving the structural conditions of the buildings where the schools were housed. Besides the physical structures of the schools, he also worked to reform the schools’ curriculum in order to foster a school environment where students would be happy to learn and teachers would be encouraged to value the students’ backgrounds, cultures, values, interests, and languages. Freire was very sensitive to language discrimination, and he worked toward creating an environment where children would not be alienated due to their non-standard Portuguese dialects, ways of speaking, and syntax. After his retirement as Secretary of Education, Freire continued with his writing projects and went back to teaching in the Supervision and Curriculum graduate program at the Pontifícia Universidade Católica de São Paulo.

In October 1986, Elza, Freire’s wife and companion of 42 years, passed away due to cardiac failure. Freire was deeply affected by the loss of his wife and struggled with depression and grief. The following year, Freire began to slowly reengage himself with his work. He began to work as a consultant for UNICEF and resumed his teaching duties at the Pontifícia Universidade Católica de São Paulo. He also attended a symposium in Los Angeles to commemorate Elza’s life. There he met the educator and social activist Myles Horton, with whom Freire would collaborate to write the book We Make the Road by Walking: Conversations on Education and Social Change (1990). Collaborating on this book with Horton allowed Freire to reengage himself with his writing and eased the pain of losing his wife.

Two years after Elza’s death, Freire married Ana Maria (Nita) Araújo Hasche. Nita’s father was Dr. Aluízio Araújo, the principal of Oswaldo Cruz secondary school, where Freire had been allowed to study at a reduced tuition when he was a young man. Nita and Freire had known each other since then, and years later Freire served as one of Nita’s doctoral dissertation advisors at the Pontifícia Universidade Católica de São Paulo. An accomplished scholar in her own right, Nita contributed significantly to Freire’s later work and has continued to carry Freire’s vision forward, publishing several of his writings posthumously. Nita and Freire lived, loved, and worked happily until Freire passed away due to heart failure on May 2, 1997. He was 75 years old.

9. Working Assumptions

Besides the main philosophical contributions explored in section 5, Paulo Freire also developed other important ideas. These are the working assumptions without which his work could not have been developed. Although they are just as important as his main philosophical contributions, they usually receive less attention from Freire scholars. This section briefly explains Freire’s working assumptions, namely, his views of human nature, authenticity, dialogue, and love.

Freire believed, as he often wrote, that the ontological vocation of every human being is to become more human. He believed that every person is always a work in progress, unfinished and open to further growth. This idea plays a central role vis-à-vis his other ideas because Freire worked from the assumption that people could change, learn, and grow to become better, more humane human beings. Freire’s idea of human nature allowed him to articulate his ideas regarding hope, which he believed was grounded in the incompleteness of human beings, who are unfinished and always in the process of becoming.

Another idea that played a central role in Freire’s philosophy was that of authenticity. Freire understood that the oppression the people he worked with had experienced had stunted their ability to live authentic lives and relate to the people around them in authentic ways. Especially at the beginning of his work, Freire noticed how many of the peasants he worked with had a deterministic view of history and their socioeconomic and political situations. Part of Freire’s goal was to help his students realize that their reality was not determined, but rather that history is made by one’s choices.

As mentioned, Freire observed that when a person internalizes an oppressor, it is difficult for her to be authentic. This is because when we internalize or host an oppressor, our intentions are split between our desire for freedom and the oppressive tendencies we have internalized, which means that we may feel the need to compete with or oppress others in order to get ahead. Alienated from ourselves, our work, and other people, and due to the dehumanizing social structures that promote non-democratic relationships, living an inauthentic life may lead us to feel anxiety and potential meaninglessness.

Dialogue is another central working assumption for Freire, who encouraged people to be open, tolerant, and willing to learn from each other. For Freire, dialogue meant the presence of equality, mutual recognition, affirmation of people, a sense of solidarity with people, and remaining open to questions. Freire wrote at length about dialogue and dialogic relationships, which he characterized as loving, humble, hopeful, and exhibiting faith in humanity. Dialogue is the basis for critical and problem-posing pedagogy, as opposed to banking education, where there is no discussion and only the imposition of the teacher’s ideas on the students.

Love is perhaps the most central working assumption that Freire develops and continues to come back to throughout his many years of work. In a video documentary, Freire says of himself, “I’m an intellectual who is not afraid of being loving. I love people and I love the world, and it is because I love people and I love the world that I fight so that social justice is implemented before charity.” Freire wrote about the role that love plays in the commitment to a liberating education early on in Pedagogy of the Oppressed, where he wrote a section on Che Guevara and the feelings of love toward the Latin American peasants Guevara sought to liberate. Freire continued coming back to the role of love in education throughout his many writings until the end of his life. In one of Freire’s last books, Pedagogy of the Heart, he further explores the role of emotions in the process of conscientização. He believed that education was an act of love, and it thus required courage to be politically committed to work toward the empowerment of our students and belief in their potential.

10. Criticisms

Several criticisms have been made of Paulo Freire’s work and theories. The most common concerns his style of writing, which critics find verbose, cumbersome, and difficult to understand. Relatedly, Freire came under attack by feminists because in his earlier books he consistently used male pronouns and male examples. Unlike English, Portuguese is a gendered language, and although Freire was sympathetic to feminism, his writing was, like most of the writing at the time, dominated by male-centered examples and pronouns. Once Freire was made aware of this shortcoming in his writing, he revised the language of his earlier books in later editions and adopted a more gender-neutral style for the writing of his later books.

Another criticism that has been made of Freire’s work is that his pedagogical model and many of his theories regarding pedagogy are not transferable from the Brazilian third-world context where they were formulated. Although teachers in the U. S. A. have tried to work with Freire’s pedagogical model, the U. S. A. context is too different, his critics argue, from the one where Freire developed his ideas.

Additionally, Freire has been criticized for not fully espousing Marxism, feminism, Catholicism, or a militaristic approach to revolutionary change. Although Freire was sympathetic to certain elements of each of these approaches and sets of beliefs, his insistence on the importance of dialogue frustrated many of his critics, who have attacked him for not having a concrete and practical method for helping people that could be used in different contexts. Freire has been criticized by leftists for his antireductionist approach and his insistence on dialogue, which in their opinion only slows down the change they want to bring about. Organizers of training events for teachers and social leaders would often invite Freire to help with the planning. Often these organizers became frustrated with Freire’s refusal to provide them with rules or a set of ready-made solutions to their problems.

11. Legacy

Numerous if not countless scholars, activists, politicians, and leaders have been influenced by Paulo Freire’s life and ideas. Among these are bell hooks, Cornel West, Angela Valenzuela, James H. Cone, Peter McLaren, Henry Giroux, Donaldo Macedo, Joe L. Kincheloe, Carlos Alberto Torres, Ira Shor, Shirley R. Steinberg, Michael W. Apple, Stanley Aronowitz, Leonardo Boff, and Jonathan Kozol.

Freire’s Pedagogy of the Oppressed has been influential the world over, and it has been translated into 17 languages. In the 21st century, it is considered to be too subversive for reading; it is one of the banned books in the state of Arizona (U. S. A.). Freire’s emancipatory model of teaching has been widely adopted in previously colonized countries and continents such as Latin America, Africa, Asia, the Philippines, India, and Papua New Guinea. Having been established to generate dialogue and support research into pedagogical approaches and theories, the Paulo Freire Institute is active in 18 countries. The World Bank funded the Southern Highlands Rural Development Program’s Literacy Campaign, which is based on a Freirean model of pedagogy.

Freire was presented with numerous medals, honorary degrees, and recognitions both during his lifetime and posthumously. Among these honors are the 1980 King Baudouin International Development Prize and the 1986 UNESCO Prize for Education for Peace. In 2008, Freire was inducted into the International Adult and Continuing Education Hall of Fame.

More important than all of the recognitions Freire received and the scholars he influenced, Freire’s life was his most significant legacy. His life’s example continues to inspire. He created the conditions by which thousands of people, the children and grandchildren of former slaves, could learn to read and write, learn about their agency and freedom, and learn to love.

12. References and Further Reading

  • Collins, Denis. Paulo Freire: His Life, Works & Thought. New York: Paulist Press, 1977.
    • Excellent short introduction to Paulo Freire’s life and philosophy.
  • Bakewell, Peter. A History of Latin America: Empires and Sequels 1450-1930. Malden, MA: Blackwell Publishers, 1997.
    • Latin American history, from colonization through independence.
  • Finn, Patrick J. Literacy With An Attitude: Educating Working-Class Children in Their Own Self-Interest. New York: SUNY Press, 2009.
    • Example of the banking model of education in the U. S. A.
  • Fonseca, Sérgio C. “Repercussões das ideias de Anísio Teixeira na obra de Paulo Freire.” Travessias, 2 (2008) 3-15.
    • Examination of Anísio Teixeira’s influence on Paulo Freire’s philosophy.
  • Freire, Ana Maria Araujo and Donaldo Macedo. The Paulo Freire Reader. New York: Continuum, 2000.
    • Presents Paulo Freire’s main ideas with an introduction written by Nita Freire.
  • Freire, Paulo. Education for Critical Consciousness. New York: Seabury Press, 1973.
    • Re-publication (in English) of Paulo Freire’s first book Education, the Practice of Freedom together with Extension or Communication.
  • Freire, Paulo. Education, the Practice of Freedom. London: Writers and Readers Publishing Cooperative, 1976.
    • Paulo Freire’s first book, where he develops the banking model of education versus critical pedagogy.
  • Freire, Paulo. Extensión o Comunicación. Colombia: Editorial América Latina, 1974.
    • Paulo Freire discusses the better and worse methods to communicate between agronomic engineers and farmers.
  • Freire, Paulo. Letters to Cristina: Reflections on My Life and Work. New York: Routledge, 1996.
    • Series of autobiographical letters written to his niece discussing the events in his life and his philosophy.
  • Freire, Paulo. Pedagogy of Freedom: Ethics, Democracy and Civic Courage. Lanham: Rowman & Littlefield Publishers, 1998.
    • One of the last books that Paulo Freire authored, offering his most mature and insightful reflections. Also contains an informative and incisive foreword written by Donaldo Macedo.
  • Freire, Paulo. Pedagogy of the Heart. New York: Continuum, 2007.
    • Written toward the end of Paulo Freire’s life. Here he takes a look back at his work while still developing nuances to his concept of conscientização.
  • Freire, Paulo. Pedagogy of Hope: Reliving Pedagogy of the Oppressed. London: Bloomsbury Academic, 2014.
    • This book is the “sequel” to Pedagogy of the Oppressed, where Paulo Freire explains the context and further elucidates the concepts he developed in Pedagogy of the Oppressed.
  • Freire, Paulo. Pedagogy of the Oppressed. New York: Continuum, 1970.
    • Paulo Freire’s most read book, where he develops the concepts of banking versus critical education.
  • Freire, Paulo. Teachers as Cultural Workers: Letters to Those Who Dare Teach. Boulder: Westview Press, 1998.
    • Paulo Freire addresses teachers and encourages us to commit ourselves to continue being caring and open-minded.
  • Freire, Paulo. “The Adult Literacy Process as Cultural Action for Freedom.” Harvard Educational Review. 40:2 (1970) 205-225.
    • Paulo Freire’s article published in the U. S. A. during the time he taught at Harvard. Here he articulates the essence of Pedagogy of the Oppressed to the American academic audience.
  • Fromm, Erich. The Heart of Man: Its potential for good and for evil. Mexico: Fund, University Press, 1967.
    • Erich Fromm develops the biophilic and necrophilic concepts that influenced Paulo Freire.
  • hooks, bell. Teaching Community: A Pedagogy of Hope. New York: Routledge, 2003.
    • Critical pedagogy in the current U. S. American context.
  • Kirylo, James D. Paulo Freire the Man from Recife. New York: Peter Lang Publishing, Inc., 2011.
    • Excellent and thorough biography of Paulo Freire.
  • Martínez, Eusebio Nájera. “Paulo Freire – Fragmentos testimoniales de una praxis 3,” Online video clip. YouTube, 20 February 2010. Web. 28 August 2015.
    • Three-part documentary on Paulo Freire and his work.
  • Valenzuela, Angela. Subtractive Schooling: U.S.-Mexican Youth and the Politics of Caring. Albany: State University of New York Press, 1999.
    • An example of the banking model of education in the U. S. A.

 

Author Information

Kim Díaz
Email: kdiaz60@epcc.edu
El Paso Community College
U. S. A.
